I have a table (tblExam) showing exam data score designed as follow:
Exam Name: String
Score: number(pecent)
Basically I am trying to pull the records by Exam name where the score are less than a specific amount (0.695 in my case).
I am using the following statement to get the results:
SELECT DISTINCTROW tblExam.name, Count(tblExam.name) AS CountOfName
FROM tblExam WHERE (((tblExam.Score)<0.695))
GROUP BY tblExam.name;
This works fine but does not display the exam that have 0 records more than 0.695; in other words I am getting this:
Exam Name count
firstExam 2
secondExam 1
thirdExam 3
The count of 0 and any exams with score above 0.695 do not show up. What I would like is something like this:
Exam Name count
firstExam 2
secondExam 1
thirdExam 3
fourthExam 0
fifthExam 0
sixthExam 2
.
..
.etc...
I hope that I am making sense here. I think that I need somekind of LEFT JOIN to display all of the exam name but I can not come up with the proper syntax.
It seems you want to display all name groups and, within each group, the count of Score < 0.695. So I think you should move < 0.695 from the WHERE to the Count() expression --- actually remove the WHERE clause.
SELECT
e.name,
Count(IIf(e.Score < 0.695, 1, Null)) AS CountOfName
FROM tblExam AS e
GROUP BY e.name;
That works because Count() counts only non-Null values. You could use Sum() instead of Count() if that seems clearer:
Sum(IIf(e.Score < 0.695, 1, 0)) AS CountOfName
Note DISTINCTROW is not useful in a GROUP BY query, because the grouping makes the rows unique without it. So I removed DISTINCTROW from the query.
Do I detect a contradiction? The query calls for results <0.695 but your text says you are also looking for results >0.695. Perhaps I don't understand. Does this give you what you are looking for:
SELECT DISTINCTROW tblExam.ExamName, Count(tblExam.ExamName) AS CountOfExamName
FROM tblExam
WHERE (((tblExam.Score)<0.695 Or (tblExam.Score)>0.695))
GROUP BY tblExam.ExamName;
Related
My SQL query needs to return a list of values alongside the date, but with my limited knowledge I have only been able to get this far.
This is my SQL:
select lsu_students.student_grouping,lsu_attendance.class_date,
count(lsu_attendance.attendance_status) AS count
from lsu_attendance
inner join lsu_students
ON lsu_students.student_grouping="Central1A"
and lsu_students.student_id=lsu_attendance.student_id
where lsu_attendance.attendance_status="Present"
and lsu_attendance.class_date="2015-02-09";
This returns:
student_grouping class_date count
Central1A 2015-02-09 23
I want it to return:
student_grouping class_date count
Central1A 2015-02-09 23
Central1A 2015-02-10 11
Central1A 2015-02-11 21
Central1A 2015-02-12 25
This query gets the list of the dates according to the student grouping:
select distinct(class_date)from lsu_attendance,lsu_students
where lsu_students.student_grouping like "Central1A"
and lsu_students.student_id = lsu_attendance.student_id
order by class_date
I think you just want a group by:
select s.student_grouping, a.class_date, count(a.attendance_status) AS count
from lsu_attendance a inner join
lsu_students s
ON s.student_grouping = 'Central1A' and
s.student_id = a.student_id
where a.attendance_status = 'Present'
group by s.student_grouping, a.class_date;
Comments:
Using single quotes for string constants, unless you have a good reason.
If you want a range of class dates, then use a where with appropriate filtering logic.
Notice the table aliases. The query is easier to write and to read.
I added student grouping to the group by. This would be required by any SQL engine other than MySQL.
Just take out and lsu_attendance.class_date="2015-02-09" or change it to a range, and then add (at the end) GROUP BY lsu_students.student_grouping,lsu_attendance.class_date.
The group by clause is what you're looking for, to limit aggregates (e.g. the count function) to work within each group.
To get the number of students present in each group on each date, you would do something like this:
select student_grouping, class_date, count(*) as present_count
from lsu_students join lsu_attendance using (student_id)
where attendance_status = 'Present'
group by student_grouping, class_date
Note: for your example, using is simpler than on (if your SQL supports it), and putting the table name before each field name isn't necessary if the column name doesn't appear in more than one table (though it doesn't hurt).
If you want to limit which data rows get included, put your constraints get in the where clause (this constrains which rows are counted). If you want to constrain the aggregate values that are displayed, you have to use the having clause. For example, to see the count of Central1A students present each day, but only display those dates where more than 20 students showed up:
select student_grouping, class_date, count(*) as present_count
from lsu_students join lsu_attendance using (student_id)
where attendance_status = 'Present' and student_grouping = 'Central1A'
group by student_grouping, class_date
having count(*) > 20
I'm getting grey hair by now...
I have a table like this.
ID - Place - Person
1 - London - Anna
2 - Stockholm - Johan
3 - Gothenburg - Anna
4 - London - Nils
And I want to get the result where all the different persons are included, but I want to choose which Place to order by.
For example. I want to get a list where they are ordered by LONDON and the rest will follow, but distinct on PERSON.
Output like this:
ID - Place - Person
1 - London - Anna
4 - London - Nils
2 - Stockholm - Johan
Tried this:
SELECT ID, Person
FROM users
ORDER BY FIELD(Place,'London'), Person ASC "
But it gives me:
ID - Place - Person
1 - London - Anna
4 - London - Nils
3 - Gothenburg - Anna
2 - Stockholm - Johan
And I really dont want Anna, or any person, to be in the result more then once.
This is one way to get the specified output, but this uses MySQL specific behavior which is not guaranteed:
SELECT q.ID
, q.Place
, q.Person
FROM ( SELECT IF(p.Person<=>#prev_person,0,1) AS r
, #prev_person := p.Person AS person
, p.Place
, p.ID
FROM users p
CROSS
JOIN (SELECT #prev_person := NULL) i
ORDER BY p.Person, !(p.Place<=>'London'), p.ID
) q
WHERE q.r = 1
ORDER BY !(q.Place<=>'London'), q.Person
This query uses an inline view to return all the rows in a particular order, by Person, so that all of the 'Anna' rows are together, followed by all the 'Johan' rows, etc. The set of rows for each person is ordered by, Place='London' first, then by ID.
The "trick" is to use a MySQL user variable to compare the values from the current row with values from the previous row. In this example, we're checking if the 'Person' on the current row is the same as the 'Person' on the previous row. Based on that check, we return a 1 if this is the "first" row we're processing for a a person, otherwise we return a 0.
The outermost query processes the rows from the inline view, and excludes all but the "first" row for each Person (the 0 or 1 we returned from the inline view.)
(This isn't the only way to get the resultset. But this is one way of emulating analytic functions which are available in other RDBMS.)
For comparison, in databases other than MySQL, we could use SQL something like this:
SELECT ROW_NUMBER() OVER (PARTITION BY t.Person ORDER BY
CASE WHEN t.Place='London' THEN 0 ELSE 1 END, t.ID) AS rn
, t.ID
, t.Place
, t.Person
FROM users t
WHERE rn=1
ORDER BY CASE WHEN t.Place='London' THEN 0 ELSE 1 END, t.Person
Followup
At the beginning of the answer, I referred to MySQL behavior that was not guaranteed. I was referring to the usage of MySQL User-Defined variables within a SQL statement.
Excerpts from MySQL 5.5 Reference Manual http://dev.mysql.com/doc/refman/5.5/en/user-variables.html
"As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement."
"For other statements, such as SELECT, you might get the results you expect, but this is not guaranteed."
"the order of evaluation for expressions involving user variables is undefined."
Try this:
SELECT ID, Place, Person
FROM users
GROUP BY Person
ORDER BY FIELD(Place,'London') DESC, Person ASC;
You want to use group by instead of distinct:
SELECT ID, Person
FROM users
GROUP BY ID, Person
ORDER BY MAX(FIELD(Place, 'London')), Person ASC;
The GROUP BY does the same thing as SELECT DISTINCT. But, you are allowed to mention other fields in clauses such as HAVING and ORDER BY.
I'm stumped with how to do the following purely in MySQL, and I've resorted to taking my result set and manipulating it in ruby afterwards, which doesn't seem ideal.
Here's the question. With a dataset of 'items' like:
id state_id price issue_date listed
1 5 450 2011 1
1 5 455 2011 1
1 5 490 2011 1
1 5 510 2012 0
1 5 525 2012 1
...
I'm trying to get something like:
SELECT * FROM items
WHERE ([some conditions], e.g. issue_date >= 2011 and listed=1)
AND state_id = 5
GROUP BY id
HAVING AVG(price) <= 500
ORDER BY price DESC
LIMIT 25
Essentially I want to grab a "group" of items whose average price fall under a certain threshold. I know that my above example "group by" and "having" are not correct since it's just going to give the AVG(price) of that one item, which doesn't really make sense. I'm just trying to illustrate my desired result.
The important thing here is I want all of the individual items in my result set, I don't just want to see one row with the average price, total, etc.
Currently I'm just doing the above query without the HAVING AVG(price) and adding up the individual items one-by-one (in ruby) until I reach the desired average. It would be really great if I could figure out how to do this in SQL. Using subqueries or something clever like joining the table onto itself are certainly acceptable solutions if they work well! Thanks!
UPDATE: In response to Tudor's answer below, here are some clarifications. There is always going to be a target quantity in addition to the target average. And we would always sort the results by price low to high, and by date.
So if we did have 10 items that were all priced at $5 and we wanted to find 5 items with an average < $6, we'd simply return the first 5 items. We wouldn't return the first one only, and we wouldn't return the first 3 grouped with the last 2. That's essentially how my code in ruby is working right now.
I would do almost an inverse of what Jasper provided... Start your query with your criteria to explicitly limit the few items that MAY qualify instead of getting all items and running a sub-select on each entry. Could pose as a larger performance hit... could be wrong, but here's my offering..
select
i2.*
from
( SELECT i.id
FROM items i
WHERE
i.issue_date > 2011
AND i.listed = 1
AND i.state_id = 5
GROUP BY
i.id
HAVING
AVG( i.price) <= 500 ) PreQualify
JOIN items i2
on PreQualify.id = i2.id
AND i2.issue_date > 2011
AND i2.listed = 1
AND i2.state_id = 5
order by
i2.price desc
limit
25
Not sure of the order by, especially if you wanted grouping by item... In addition, I would ensure an index on (state_id, Listed, id, issue_date)
CLARIFICATION per comments
I think I AM correct on it. Don't confuse "HAVING" clause with "WHERE". WHERE says DO or DONT include based on certain conditions. HAVING means after all the where clauses and grouping is done, the result set will "POTENTIALLY" accept the answer. THEN the HAVING is checked, and if IT STILL qualifies, includes in the result set, otherwise throws it out. Try the following from the INNER query alone... Do once WITHOUT the HAVING clause, then again WITH the HAVING clause...
SELECT i.id, avg( i.price )
FROM items i
WHERE i.issue_date > 2011
AND i.listed = 1
AND i.state_id = 5
GROUP BY
i.id
HAVING
AVG( i.price) <= 500
As you get more into writing queries, try the parts individually to see what you are getting vs what you are thinking... You'll find how / why certain things work. In addition, you are now talking in your updated question about getting multiple IDs and prices at apparent low and high range... yet you are also applying a limit. If you had 20 items, and each had 10 qualifying records, your limit of 25 would show all of the first item and 5 into the second... which is NOT what I think you want... you may want 25 of each qualified "id". That would wrap this query into yet another level...
What MySQL does makes perfectly sense. What you want to do does not make sense:
if you have let's say 4 items, each with price of 5 and you put HAVING AVERAGE <= 7 what you say is that the query should return ALL the permutations, like:
{1} - since item with id 1, can be a group by itself
{1,2}
{1,3}
{1,4}
{1,2,3}
{1,2,4}
...
and so on?
Your algorithm of computing the average in ruby is also not valid, if you have items with values 5, 1, 7, 10 - and seek for an average value of less than 7, element with value 10 can be returned just in a group with element of value 1. But, by your algorithm (if I understood correctly), element with value 1 is returned in the first group.
Update
What you want is something like the Knapsack problem and your approach is using some kind of Greedy Algorithm to solve it. I don't think there are straight, easy and correct ways to implement that in SQL.
After a google search, I found this article which tries to solve the knapsack problem with AI written in SQL.
By considering your item price as a weight, having the number of items and the desired average, you could compute the maximum value that can be entered in the 'knapsack' by multiplying desired_cost with number_of_items
I'm not entirely sure from your question, but I think this is a solution to your problem:
SELECT * FROM items
WHERE (some "conditions", e.g. issue_date > 2011 and listed=1)
AND state_id = 5
AND id IN (SELECT id
FROM items
GROUP BY id
HAVING AVG(price) <= 500)
ORDER BY price DESC
LIMIT 25
note: This is off the top of my head and I haven't done complex SQL in a while, so it might be wrong. I think this or something like it should work, though.
Evening folks,
I have a complex MySQL COUNT query I am trying to perform and am looking for the best way to do it.
In our system, we have References. Each Reference can have many (or no) Income Sources, each of which can be validated or not (status). We have a Reference table and an Income table - each row in the Income table points back to Reference with reference_id
On our 'Awaiting' page (the screen that shows each Income that is yet to be validated), we show it grouped by Reference. So you may, for example, see Mr John Smith has 3 Income Sources.
We want it to show something like "2 of 3 Validated" beside each row
My problem is writing the query that figures this out!
What I have been trying to do is this, using a combination of PHP and MySQL to bridge the gap where SQL (or my knowledge) falls short:
First, select a COUNT of the number of incomes associated with each reference:
SELECT `reference_id`, COUNT(status) AS status_count
FROM (`income`)
WHERE `income`.`status` = 0
GROUP BY `reference_id`
Next, having used PHP to generate a WHERE IN clause, proceed to COUNT the number of confirmed references from these:
SELECT `reference_id`, COUNT(status) AS status_count
FROM (`income`)
WHERE `reference_id` IN ('8469', '78969', '126613', ..... etc
AND status = 1
GROUP BY `reference_id`
However this doesn't work. It returns 0 rows.
Any way to achieve what I'm after?
Thanks!
In MySQL, you can SUM() on a boolean expression to get a count of the rows where that expression is true. You can do this because MySQL treats true as the integer 1 and false as the integer 0.
SELECT `reference_id`,
SUM(`status` = 1) AS `validated_count`,
COUNT(*) AS `total_count`
FROM `income`
GROUP BY `reference_id`
This is a summary version of the problems I am encountering, but hits the nub of my problem. The real problem involves huge UNION groups of monthly data tables, but the SQL would be huge and add nothing. So:
SELECT entity_id,
sum(day_call_time) as day_call_time
from (
SELECT entity_id,
sum(answered_day_call_time) as day_call_time
FROM XCDRDNCSum201108
where (day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
) as summary
is the problem: when the table in the subquery XCDRDNCSum201108 returns no rows, because it is a sum, the column values contain null. And entity_id is part of the primary key, and cannot be null.
If I take out the sum, and just query entity_id, the subquery contains no rows, and thus the outer query does not fail, but when I use sum, I get error 1048 Column 'entity_id' cannot be null
how do I work around this problem ? Sometimes there is no data.
You are completely overworking the query... pre-summing inside, then summing again outside. In addition, I understand you are not a DBA, but if you are ever doing an aggregation, you TYPICALLY need the criteria that its grouped by. In the case presented here, you are getting sum of calls for all entity IDs. So you must have a group by any non-aggregates. However, if all you care about is the Grand total WITHOUT respect to the entity_ID, then you could skip the group by, but would also NOT include the actual entity ID...
If you want inclusive to show actual time per specific entity ID...
SELECT
entity_id,
sum(answered_day_call_time) as day_call_time,
count(*) number_of_calls
FROM
XCDRDNCSum201108
where
(day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
group by
entity_id
This would result in something like (fictitious data)
Entity_ID Day_Call_Time Number_Of_Calls
1 10 3
2 45 4
3 27 2
If all you cared about were the total call times
SELECT
sum(answered_day_call_time) as day_call_time,
count(*) number_of_calls
FROM
XCDRDNCSum201108
where
(day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
This would result in something like (fictitious data)
Day_Call_Time Number_Of_Calls
82 9
Would:
sum(answered_day_call_time) as day_call_time
changed to
ifnull(sum(answered_day_call_time),0) as day_call_time
work? I'm assuming mysql here but the coalesce function would/should work too.