Adding Row Values when there are no results - MySQL

Problem Statement: I need my result set to include records that would not naturally return because they are NULL.
I'm going to put some simplified code here since my code seems to be too long.
Table Scores has Company_type, Company, Score, Project_ID
Select Score, Count(Project_ID)
FROM Scores
WHERE company_type= :company_type
GROUP BY Score
Results in the following:
Score Projects
5 95
4 94
3 215
2 51
1 155
Everything is working fine until I apply a condition to company_type that does not include results in one of the 5 score categories. When this happens, I don't have 5 rows in my result set any more.
It displays like this:
Score Projects
5 5
3 6
1 3
I'd like it to display like this:
Score Projects
5 5
4 0
3 6
2 0
1 3
I need the results to always display 5 rows. (Scores = 1-5)
I tried one of the approaches below by Spencer7593. My simplified query now looks like this:
SELECT i.score AS Score, IFNULL(count(*), 0) AS Projects
FROM (SELECT 5 AS score
UNION ALL
SELECT 4
UNION ALL
SELECT 3
UNION ALL
SELECT 2
UNION ALL
SELECT 1) i
LEFT JOIN Scores ON Scores.score = i.score
GROUP BY Score
ORDER BY i.score DESC
This gives the following results, which are accurate except that the rows showing 1 in Projects should actually show 0: those rows come only from the inline view "i". There are no projects with a score of 5 or 2.
Score Projects
5 1
4 5
3 6
2 1
1 3
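Solved! I just needed to adjust my count to look specifically at the project column, count(project) rather than count(*). count(*) also counts the NULL-filled row that the outer join produces when no project matches, whereas counting a column skips NULLs. This returned the expected results.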
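To see the difference between the two counts side by side, here is a minimal sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for MySQL, and the sample rows are made up for illustration; only the table and column names come from the question.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL Scores table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Scores (company_type TEXT, company TEXT, score INTEGER, project_id INTEGER)"
)
# Hypothetical sample data: no projects with a score of 5 or 2.
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?, ?)",
    [("A", "c1", 4, 10), ("A", "c1", 4, 11), ("A", "c2", 3, 12), ("A", "c3", 1, 13)],
)

query = """
SELECT i.score,
       COUNT(*)                 AS count_star,     -- also counts the NULL-filled row
       COUNT(Scores.project_id) AS count_project   -- skips NULLs, so no match gives 0
FROM (SELECT 5 AS score UNION ALL SELECT 4 UNION ALL SELECT 3
      UNION ALL SELECT 2 UNION ALL SELECT 1) i
LEFT JOIN Scores ON Scores.score = i.score
GROUP BY i.score
ORDER BY i.score DESC
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

For scores 5 and 2 the first count is 1 and the second is 0, which is exactly the discrepancy described above.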

If you always want your query to return 5 rows, with Score values of 5,4,3,2,1... you'll need a rowsource that supplies those Score values.
One approach would be to use a simple query to return those fixed values, e.g.
SELECT 5 AS score
UNION ALL SELECT 4
UNION ALL SELECT 3
UNION ALL SELECT 2
UNION ALL SELECT 1
Then use that query as inline view, and do an outer join operation to the results from your current query
SELECT i.score AS `Score`
, IFNULL(q.projects,0) AS `Projects`
FROM ( SELECT 5 AS score
UNION ALL SELECT 4
UNION ALL SELECT 3
UNION ALL SELECT 2
UNION ALL SELECT 1
) i
LEFT
JOIN (
-- the current query with "missing" Score rows goes here
-- for completeness of this example, without the query
-- we emulate that result with a different query
SELECT 5 AS score, 95 AS projects
UNION ALL SELECT 3, 215
UNION ALL SELECT 1, 155
) q
ON q.score = i.score
ORDER BY i.score DESC
It doesn't have to be the inline view used in this example, but there does need to be a rowsource that the rows can be returned from. You could, for example, have a simple table that contains those five rows, with those five score values.
This is just one example of the general approach. It might be possible to modify your existing query to return the rows you want, but without seeing the query, the schema, and example data, we can't tell.
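As a rough, runnable sketch of the same pattern with a real aggregate in place of the emulated one: this assumes SQLite (via Python's sqlite3) standing in for MySQL, a Scores table shaped like the one in the question, and made-up rows and company_type values ("public", "private").

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL (assumption); sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Scores (company_type TEXT, company TEXT, score INTEGER, project_id INTEGER)"
)
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?, ?)",
    [
        ("public", "c1", 5, 1),
        ("public", "c2", 3, 2),
        ("public", "c2", 3, 3),
        ("private", "c3", 4, 4),
        ("private", "c4", 2, 5),
    ],
)

query = """
SELECT i.score, IFNULL(q.projects, 0) AS projects
FROM (SELECT 5 AS score UNION ALL SELECT 4 UNION ALL SELECT 3
      UNION ALL SELECT 2 UNION ALL SELECT 1) i       -- fixed rowsource: always 5 rows
LEFT JOIN (
    SELECT score, COUNT(project_id) AS projects      -- the "current" query
    FROM Scores
    WHERE company_type = :company_type
    GROUP BY score
) q ON q.score = i.score
ORDER BY i.score DESC
"""
rows = conn.execute(query, {"company_type": "public"}).fetchall()
for row in rows:
    print(row)
```

Because the filter lives inside the joined subquery rather than in the outer WHERE clause, the five rows from "i" survive even when nothing matches.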
FOLLOWUP
Based on the edit to the question, showing an example of the current query.
If we are guaranteed that the five values of Score will always appear in the Scores table, we could do conditional aggregation, writing a query like this:
SELECT s.score
, COUNT(IF(s.company_type = :company_type,s.project_id,NULL)) AS projects
FROM Scores s
GROUP BY s.score
ORDER BY s.score DESC
(Note that this will require a scan of all the rows, so it may not perform as well. The "trick" is the IF function, which returns a NULL value in place of project_id when the row would have been excluded by the WHERE clause.)
If we are guaranteed that project_id is non-NULL, we could use a more terse MySQL shorthand expression to achieve an equivalent result...
, IFNULL(SUM(s.company_type = :company_type),0) AS projects
This works because MySQL returns 1 when the comparison is TRUE, and otherwise returns 0 or NULL.
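A small sketch comparing the two conditional-aggregation forms, under some assumptions: SQLite (via Python's sqlite3) stands in for MySQL, CASE is used in place of MySQL's IF() for portability, and the rows are invented so that every score from 1 to 5 appears at least once in the table.

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL (assumption); sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Scores (company_type TEXT, company TEXT, score INTEGER, project_id INTEGER)"
)
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?, ?)",
    [
        ("public", "c1", 5, 1),
        ("public", "c2", 3, 2),
        ("public", "c2", 3, 3),
        ("private", "c3", 4, 4),
        ("private", "c4", 2, 5),
        ("private", "c5", 1, 6),
    ],
)

query = """
SELECT s.score,
       -- CASE yields NULL for rows the WHERE clause would have excluded; COUNT skips them
       COUNT(CASE WHEN s.company_type = :company_type THEN s.project_id END) AS by_count,
       -- the comparison evaluates to 1 or 0, so SUM adds up the matching rows
       IFNULL(SUM(s.company_type = :company_type), 0) AS by_sum
FROM Scores s
GROUP BY s.score
ORDER BY s.score DESC
"""
rows = conn.execute(query, {"company_type": "public"}).fetchall()
for row in rows:
    print(row)
```

Both columns agree here because project_id is never NULL in the sample data; with NULL project_id values the COUNT form would skip those rows while the SUM form would still count them.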

Try something like this:
select s.score, ifnull(x.cnt, 0) as cnt
from (
select distinct score from scores
) s
left outer join (
Select Score, Count(Project_ID) cnt
FROM Scores
WHERE company_type= :company_type
GROUP BY Score
) x
on s.score = x.score

Your posted query would not work without a group by statement. However, even there, if you don't have those particular scores for that company type, it wouldn't work either.
One option is to use an outer join. That would require a little more work though.
Here's another option using conditional aggregation:
select Score, sum(company_type=:company_type)
from Scores
group by Score

Related

Possible to count number of occurrences in a "group" in MySQL?

Sorry if the title is misleading, I don't really know the terminology for what I want to accomplish. But let's consider this table:
CREATE TABLE entries (
id INT NOT NULL,
number INT NOT NULL
);
Let's say it contains four numbers associated with each id, like this:
id number
1 0
1 9
1 17
1 11
2 5
2 8
2 9
2 0
.
.
.
Is it possible, with a SQL query only, to count the number of matches for any two given numbers (tuples) associated with an id?
Let's say I want to count the ids that have both the number 0 and the number 9 associated with them. In the sample data above, 0 and 9 occur together two times (once where id=1 and once where id=2). I can't think of how to write a SQL query that solves this. Is it possible? Maybe my table structure is wrong, but that's how my data is organized right now.
I have tried sub-queries, unions, joins and everything else, but haven't found a way yet.
You can use GROUP BY and HAVING clauses:
SELECT COUNT(s.id)
FROM(
SELECT t.id
FROM YourTable t
WHERE t.number in(0,9)
GROUP BY t.id
HAVING COUNT(distinct t.number) = 2) s
Or with EXISTS():
SELECT COUNT(distinct t.id)
FROM YourTable t
WHERE EXISTS(SELECT 1 FROM YourTable s
WHERE t.id = s.id and s.number IN(0,9)
HAVING COUNT(distinct s.number) = 2)
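Here is a quick way to check the GROUP BY / HAVING version against the sample data. It is only a sketch: SQLite (via Python's sqlite3) stands in for MySQL, the table is named entries as in the question, and the rows for id 3 are an invented extra case that has 0 but not 9.

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (id INT NOT NULL, number INT NOT NULL)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [
        (1, 0), (1, 9), (1, 17), (1, 11),
        (2, 5), (2, 8), (2, 9), (2, 0),
        (3, 0), (3, 4),  # hypothetical id that has 0 but not 9
    ],
)

query = """
SELECT COUNT(*)
FROM (
    SELECT t.id
    FROM entries t
    WHERE t.number IN (0, 9)             -- keep only the two numbers of interest
    GROUP BY t.id
    HAVING COUNT(DISTINCT t.number) = 2  -- the id must have both of them
) s
"""
matches = conn.execute(query).fetchone()[0]
print(matches)
```

Ids 1 and 2 qualify; id 3 is dropped by the HAVING clause because it only contributes one distinct number.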

SQL Create variables from 2 different tables

I have 2 different tables called observations and intervals.
observations:
id type date
1 recess 03.05.2011 17:00
2 recess 03.06.2011 12:00
intervals:
id observation_id value
1 1 5
2 1 8
3 2 4
4 2 4
I want a view that will display:
observation_id
percent_positive ((count where value = 5)/(total number of observations))
1 .5
2 0
I know
Select observation_id, Count(*) from intervals where value = 5 Group by
observation_id
will give me only one row, because observation 2 has no intervals with value = 5:
1 1
and
Select observation_id, Count(*) from intervals Group by
observation_id
will give me:
1 2
2 2
So how do I combine these to create a view with the percent_positive variable I'm looking for?
You can use joins to fetch data from two tables that share a common column.
For more detail, please read "Multiple values from multiple Tables".
This gave me your desired result. Not proficient enough in SQL to determine if this is the optimal way of solving the issue though.
SELECT
observation_id as obs,
(SELECT COUNT(*) FROM intervals WHERE observation_id = obs AND value = 5)/(SELECT COUNT(*) FROM intervals WHERE observation_id = obs) as percent
FROM observations
JOIN intervals ON observations.id = intervals.observation_id
GROUP BY observation_id;
SELECT
i.observation_id,
SUM(IF(i.value=5,1,0)) / counts.num as 'percent_positive'
FROM intervals i
inner join (
select observation_id, count(1) as num from intervals group by observation_id
) counts on counts.observation_id = i.observation_id
group by i.observation_id
order by i.observation_id
;
That oughta get you close, can't actually run to test at the moment. I'm not sure about the significance of the value 5 meaning positive, so the i.value=5 test might need to be modified to do what you want. This will only include observation IDs with intervals that refer to them; if you want ALL observations then you'll need to join that table (select from that table and left join the others, to be precise). Of course the percentage for those IDs will be 0 divided by 0 anyway...
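For the "ALL observations" variant mentioned above, a possible sketch is to select from observations and left join intervals, letting AVG over the comparison produce the fraction directly. The assumptions: SQLite (via Python's sqlite3) stands in for MySQL, the date is stored as plain text, and observation 3 is a made-up row with no intervals.

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE observations (id INTEGER, type TEXT, date TEXT)")
conn.execute("CREATE TABLE intervals (id INTEGER, observation_id INTEGER, value INTEGER)")
conn.executemany(
    "INSERT INTO observations VALUES (?, ?, ?)",
    [
        (1, "recess", "03.05.2011 17:00"),
        (2, "recess", "03.06.2011 12:00"),
        (3, "recess", "03.07.2011 09:00"),  # hypothetical observation with no intervals
    ],
)
conn.executemany(
    "INSERT INTO intervals VALUES (?, ?, ?)",
    [(1, 1, 5), (2, 1, 8), (3, 2, 4), (4, 2, 4)],
)

query = """
SELECT o.id AS observation_id,
       AVG(i.value = 5) AS percent_positive  -- comparison is 1/0, NULL when no interval
FROM observations o
LEFT JOIN intervals i ON i.observation_id = o.id
GROUP BY o.id
ORDER BY o.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The observation without intervals comes back with NULL (None in Python) rather than 0, which matches the "0 divided by 0" case; wrap the expression in IFNULL if 0 is preferred. The same SELECT could be used as the body of a CREATE VIEW statement.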

SQL query for selecting maximum from 2 different columns

I got a question in my homework for SQL about selecting the maximum values from the same table that have different class "Letters"
For example:
ID Student Group Avg(value)
-------------------------------------
1 stud1 A 9
2 stud2 A 9.5
3 stud3 B 8
4 stud4 B 8.5
What my query should do is show stud2 and stud4, the students with the maximum in their respective groups.
I managed to do it in the end, but it took a lot of characters, so I thought that maybe there's a shorter way to do it. Any ideas? I first searched for the id of the student that has the max avg(value) in group A, intersected that with the id of the student that has the max avg(value) in B, put everything into one big select, and then used those intersected IDs in another query that showed some other things about those IDs. But as I said, it looked far too long and I thought that maybe there's a shorter way.
Try this (I renamed group to grp and avg to avg_val, as GROUP is a reserved keyword and AVG is a function name):
select t1.*
from your_table t1
inner join (
select grp, max(avg_val) avg_val
from your_table
group by grp
) t2 on t1.grp = t2.grp
and t1.avg_val = t2.avg_val;
It finds maximum avg value per group and joins it with original table to get the corresponding students.
Please note that if there are multiple students with the same avg as the max value of that group, all of those students will be returned.
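To try the join against the sample rows, here is a minimal sketch. It assumes SQLite (via Python's sqlite3) as a stand-in for MySQL and uses the placeholder table name your_table with the renamed columns grp and avg_val from the query above; the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id INTEGER, student TEXT, grp TEXT, avg_val REAL)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?)",
    [(1, "stud1", "A", 9.0), (2, "stud2", "A", 9.5), (3, "stud3", "B", 8.0), (4, "stud4", "B", 8.5)],
)

query = """
SELECT t1.*
FROM your_table t1
INNER JOIN (
    SELECT grp, MAX(avg_val) AS avg_val   -- best value per group
    FROM your_table
    GROUP BY grp
) t2 ON t1.grp = t2.grp
    AND t1.avg_val = t2.avg_val           -- keep only rows that hit the group maximum
ORDER BY t1.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```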

MySQL : Group By Clause Not Using Index when used with Case

I'm using MySQL.
I can't change the DB structure, so that's not an option, sadly.
THE ISSUE:
When I use GROUP BY with CASE (as needed in my situation), MySQL uses
filesort and the delay is huge (approx. 2-3 minutes):
http://sqlfiddle.com/#!9/f97d8/11/0
But when I don't use CASE and just GROUP BY group_id, MySQL easily uses
an index and the result is fast:
http://sqlfiddle.com/#!9/f97d8/12/0
Scenario: DETAILED
Table msgs, containing records of sent messages, with fields:
id,
user_id, (the guy who sent the message)
type, (0 means it's a group msg. All msgs sent to a group are marked with its group_id. So if, say, 5 msgs were sent to group_id = 5, the table will have 5 records with group_id = 5 and type = 0. For type > 0, group_id will be NULL, because all other types are individual msgs sent to a single recipient)
group_id (if type=0, will contain group_id, else NULL)
Table contains approx 10 million records for user id 50001 and with different types (i.e group as well as individual msgs)
Now the QUERY:
SELECT
msgs.*
FROM
msgs
INNER JOIN accounts
ON (
msgs.user_id = accounts.id
)
WHERE 1
AND msgs.user_id IN (50111)
AND msgs.type IN (0, 1, 5, 7)
GROUP BY CASE `msgs`.`type` WHEN 0 THEN `msgs`.`group_id` ELSE `msgs`.`id` END
ORDER BY `msgs`.`group_id` DESC
LIMIT 100
I HAVE to get the summary in a single QUERY,
so msgs sent to a group, let's say group 5 (which has 5 records in this table), will be shown as 1 record in the summary (I may show COUNT later, but that's not an issue).
The individual msgs have NULL as group_id, so I can't just use 'GROUP BY group_id', because that would group all individual msgs into a single record, which is not acceptable.
Sample output can be something like:
id owner_id, type group_id COUNT
1 50001 0 2 5
1 50001 1 NULL 1
1 50001 4 NULL 1
1 50001 0 7 5
1 50001 5 NULL 1
1 50001 5 NULL 1
1 50001 5 NULL 1
1 50001 0 10 5
Now the problem is that the GROUP BY condition using CASE (which I currently think I need, because I only want to group by group_id if type=0) is causing a lot of delay because it's not using indexes, which it does if I don't use CASE (i.e. just GROUP BY group_id). Please view the SQLFiddles above to see the EXPLAIN results.
Can anyone please advise how to optimize this?
UPDATE
I tried a workaround that does work to some extent (it drops the INITIAL queries to 1 sec). It uses a UNION: the idea is to shrink the result set that forces MySQL to write to disk for the filesort (due to the huge result set) by limiting the group msgs and the individual msgs separately (see the query below).
-- The first part of the union retrieves group msgs (those with type 0, grouped by group_id) and applies a limit to contain the otherwise out-of-control result set.
-- The second part retrieves individual msgs (those with type != 0, grouped by msgs.id; not strictly necessary, but it guards against duplicate entries due to joins) and applies a limit in the same way.
-- The outer query combines the two to retrieve the desired result set.
Here's the query:
SELECT
*
FROM
(
(
SELECT
msgs.id as reference_id, user_id, type, group_id
FROM
msgs
INNER JOIN accounts
ON (msgs.user_id = accounts.id)
WHERE 1
AND accounts.id IN (50111 ) AND type = 0
GROUP BY msgs.group_id
ORDER BY msgs.id DESC
LIMIT 40
)
UNION
ALL
(
SELECT
msgs.id as reference_id, user_id, type, group_id
FROM
msgs
INNER JOIN accounts
ON (
msgs.user_id = accounts.id
)
WHERE 1
AND msgs.type != 0
AND accounts.id IN (50111)
GROUP BY msgs.id
ORDER BY msgs.id
LIMIT 40
)
) AS temp
ORDER BY reference_id
LIMIT 20,20
But it has a lot of caveats:
- I need to handle the limit in the inner queries as well. Let's say 20 records per page, and I'm on page 4. For the inner queries, I need to apply LIMIT 0,80, since I'm not sure how many records each of the two parts contributed to the previous 3 pages. So, as the records per page and the number of pages grow, my query grows heavier. With, say, 1k records per page on page 100 or 1K, the load gets heavier and the time increases sharply.
- I need to handle ordering in the inner queries and then apply it again to the result set prepared by the union; conditions need to be applied to both inner queries separately (but that's not much of an issue).
- I can't use SQL_CALC_FOUND_ROWS, so I will need to get the count using separate queries.
The main issue is the first one. The higher I go with the pagination, the heavier it gets.
Would this run faster?
SELECT id, user_id, type, group_id
FROM
( SELECT id, user_id, type, group_id, IFNULL(group_id, id) AS foo
FROM msgs
WHERE user_id IN (50111)
AND type IN (0, 1, 5, 7)
) AS x
GROUP BY foo
ORDER BY `group_id` DESC
LIMIT 100
It needs INDEX(user_id, type).
Does this give the 'correct' answer?
SELECT DISTINCT *
FROM msgs
WHERE user_id IN (50111)
AND type IN (0, 1, 5, 7)
GROUP BY IFNULL(group_id, id)
ORDER BY `group_id` DESC
LIMIT 100
(It needs the same index)
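To check what grouping on IFNULL(group_id, id) actually produces, here is a small sketch. It only looks at the grouping result, not at index usage or speed; it assumes SQLite (via Python's sqlite3) in place of MySQL, and the rows, the alias grouping_key, and the group ids 100 and 200 are made up.

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL (assumption); sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE msgs (id INTEGER, user_id INTEGER, type INTEGER, group_id INTEGER)")
conn.executemany(
    "INSERT INTO msgs VALUES (?, ?, ?, ?)",
    [
        (1, 50111, 0, 100),   # group msg, group 100
        (2, 50111, 0, 100),   # group msg, group 100
        (3, 50111, 1, None),  # individual msg
        (4, 50111, 5, None),  # individual msg
        (5, 50111, 0, 200),   # group msg, group 200
    ],
)

query = """
SELECT IFNULL(group_id, id) AS grouping_key,  -- group_id for group msgs, own id otherwise
       COUNT(*) AS cnt
FROM msgs
WHERE user_id IN (50111)
  AND type IN (0, 1, 5, 7)
GROUP BY grouping_key
ORDER BY grouping_key
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

One thing to watch with this key: a group msg whose group_id happens to equal the id of an individual msg would land in the same group. The sample data avoids that on purpose; the original CASE expression has the same property.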

MySQL Conditional count based on a value in another column

I have table that looks like this:
id rank
a 2
a 1
b 4
b 3
c 7
d 1
d 1
e 9
I need to get all the distinct rank values on one column and count of all the unique id's that have reached equal or higher rank than in the first column.
So the result I need would be something like this:
rank count
1 5
2 4
3 3
4 3
7 2
9 1
I've been able to make a table with all the unique id's with their max rank:
SELECT
MAX(rank) AS 'TopRank',
id
FROM myTable
GROUP BY id
I'm also able to get all the distinct rank values and count how many id's have reached exactly that rank:
SELECT
DISTINCT TopRank AS 'rank',
COUNT(id) AS 'count of id'
FROM
(SELECT
MAX(rank) AS 'TopRank',
id
FROM myTable
GROUP BY id) tableDerp
GROUP BY TopRank
ORDER BY TopRank ASC
But I don't know how to get count of id's where the rank is equal OR HIGHER than the rank in column 1. Trying SUM(CASE WHEN TopRank > TopRank THEN 1 END) naturally gives me nothing. So how can I get the count of id's where the TopRank is higher or equal to each distinct rank value? Or am I looking in the wrong way and should try something like running totals instead? I tried to look for similar questions but I think I'm completely on a wrong trail here since I couldn't find any and this seems a pretty simple problem that I'm just overthinking somehow. Any help much appreciated.
One approach is to use a correlated subquery. Just get the list of ranks and then use a correlated subquery to get the count you are looking for:
SELECT r.rank,
(SELECT COUNT(DISTINCT t2.id)
FROM myTable t2
WHERE t2.rank >= r.rank
) as cnt
FROM (SELECT DISTINCT `rank` FROM myTable) r;
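Running the correlated subquery against the sample table from the question reproduces the desired result. This is a sketch with SQLite (via Python's sqlite3) standing in for MySQL; the column name is double-quoted here because that is SQLite's identifier quoting, and an ORDER BY is added to fix the row order.

```python
import sqlite3

# SQLite in memory as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE myTable (id TEXT, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("a", 2), ("a", 1), ("b", 4), ("b", 3), ("c", 7), ("d", 1), ("d", 1), ("e", 9)],
)

query = """
SELECT r."rank",
       (SELECT COUNT(DISTINCT t2.id)      -- ids with at least one row at this rank or higher
        FROM myTable t2
        WHERE t2."rank" >= r."rank") AS cnt
FROM (SELECT DISTINCT "rank" FROM myTable) r
ORDER BY r."rank"
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```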