Complex SQL Select query with inner join - mysql

My SQL query needs to return a list of values alongside the date, but with my limited knowledge I have only been able to get this far.
This is my SQL:
select lsu_students.student_grouping,lsu_attendance.class_date,
count(lsu_attendance.attendance_status) AS count
from lsu_attendance
inner join lsu_students
ON lsu_students.student_grouping="Central1A"
and lsu_students.student_id=lsu_attendance.student_id
where lsu_attendance.attendance_status="Present"
and lsu_attendance.class_date="2015-02-09";
This returns:
student_grouping class_date count
Central1A 2015-02-09 23
I want it to return:
student_grouping class_date count
Central1A 2015-02-09 23
Central1A 2015-02-10 11
Central1A 2015-02-11 21
Central1A 2015-02-12 25
This query gets the list of the dates according to the student grouping:
select distinct class_date from lsu_attendance, lsu_students
where lsu_students.student_grouping like "Central1A"
and lsu_students.student_id = lsu_attendance.student_id
order by class_date

I think you just want a group by:
select s.student_grouping, a.class_date, count(a.attendance_status) AS count
from lsu_attendance a inner join
lsu_students s
ON s.student_grouping = 'Central1A' and
s.student_id = a.student_id
where a.attendance_status = 'Present'
group by s.student_grouping, a.class_date;
Comments:
Use single quotes for string constants, unless you have a good reason not to.
If you want a range of class dates, then use a where with appropriate filtering logic.
Notice the table aliases. The query is easier to write and to read.
I added student grouping to the group by. This would be required by any SQL engine other than MySQL.
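A minimal sketch of the grouped query, runnable with Python's built-in sqlite3 module. SQLite stands in for MySQL here, and the sample rows are invented for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical miniature of the question's tables; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lsu_students (student_id INTEGER, student_grouping TEXT);
    CREATE TABLE lsu_attendance (student_id INTEGER, class_date TEXT, attendance_status TEXT);
    INSERT INTO lsu_students VALUES (1, 'Central1A'), (2, 'Central1A'), (3, 'East2B');
    INSERT INTO lsu_attendance VALUES
        (1, '2015-02-09', 'Present'), (2, '2015-02-09', 'Present'),
        (1, '2015-02-10', 'Present'), (2, '2015-02-10', 'Absent'),
        (3, '2015-02-09', 'Present');
""")

# One output row per (grouping, date) pair instead of a single total.
rows = conn.execute("""
    SELECT s.student_grouping, a.class_date, COUNT(a.attendance_status) AS present_count
    FROM lsu_attendance a
    INNER JOIN lsu_students s ON s.student_id = a.student_id
    WHERE s.student_grouping = 'Central1A'
      AND a.attendance_status = 'Present'
    GROUP BY s.student_grouping, a.class_date
    ORDER BY a.class_date
""").fetchall()

for row in rows:
    print(row)
```

With this sample data the query yields one row for 2015-02-09 (count 2) and one for 2015-02-10 (count 1).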

Just take out the condition and lsu_attendance.class_date="2015-02-09" (or change it to a range), and then add GROUP BY lsu_students.student_grouping, lsu_attendance.class_date at the end.

The group by clause is what you're looking for, to limit aggregates (e.g. the count function) to work within each group.
To get the number of students present in each group on each date, you would do something like this:
select student_grouping, class_date, count(*) as present_count
from lsu_students join lsu_attendance using (student_id)
where attendance_status = 'Present'
group by student_grouping, class_date
Note: for your example, using is simpler than on (if your SQL supports it), and putting the table name before each field name isn't necessary if the column name doesn't appear in more than one table (though it doesn't hurt).
If you want to limit which data rows get included, put your constraints in the where clause (this constrains which rows are counted). If you want to constrain the aggregate values that are displayed, you have to use the having clause. For example, to see the count of Central1A students present each day, but only display those dates where more than 20 students showed up:
select student_grouping, class_date, count(*) as present_count
from lsu_students join lsu_attendance using (student_id)
where attendance_status = 'Present' and student_grouping = 'Central1A'
group by student_grouping, class_date
having count(*) > 20
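To see the difference between the two filters, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The rows are invented, and the threshold is lowered from 20 to 1 so that the tiny sample shows the effect.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lsu_students (student_id INTEGER, student_grouping TEXT);
    CREATE TABLE lsu_attendance (student_id INTEGER, class_date TEXT, attendance_status TEXT);
    INSERT INTO lsu_students VALUES (1, 'Central1A'), (2, 'Central1A'), (3, 'Central1A');
    INSERT INTO lsu_attendance VALUES
        (1, '2015-02-09', 'Present'), (2, '2015-02-09', 'Present'), (3, '2015-02-09', 'Present'),
        (1, '2015-02-10', 'Present'), (2, '2015-02-10', 'Absent'),
        (1, '2015-02-11', 'Present'), (2, '2015-02-11', 'Present');
""")

# WHERE decides which rows are counted at all.
base = """
    SELECT class_date, COUNT(*) AS present_count
    FROM lsu_students JOIN lsu_attendance USING (student_id)
    WHERE attendance_status = 'Present' AND student_grouping = 'Central1A'
    GROUP BY class_date
"""
all_days = conn.execute(base + " ORDER BY class_date").fetchall()

# HAVING decides which of the resulting groups are displayed.
busy_days = conn.execute(base + " HAVING COUNT(*) > 1 ORDER BY class_date").fetchall()

print(all_days)
print(busy_days)
```

The first result lists all three dates; the second drops 2015-02-10, whose count of 1 does not pass the having test.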

Related

SQL query help, data is now showing in 1 column

I am facing a problem in composing a SQL query
Sculptor (SRID, SR_FName, SR_LName, SR_DOB)
Sculpture (SEID, SE_Name, SE_Value, SRID*, SID*)
Model (MID, M_FName, M_LName, M_Salary)
List the sculptor last name and total sculpture value (as one field named “Total Sculpture Value”) of all the sculptures that each sculptor has sculpted (rounded to two decimal places), sorted in ascending order by total value. Only list those sculptors whose total sculpture value is less than 100,000.
My query is:
SELECT SR_LName,
SE_Value AS TotalSculptureValue
FROM Sculptor
JOIN Sculpture USING (SRID)
HAVING ( SE_Value ) < 100000
You do need to learn to use aggregate functions, but here's how this one would work:
SELECT SR_LName,
sum(SE_Value) AS "TotalSculptureValue"
FROM Sculptor
JOIN Sculpture USING (SRID)
group by SR_LName
Try running that query first. You should see all of the sculptor names with the sum of their values.
Then add the HAVING back in, because that's how you restrict the results of aggregate functions:
SELECT SR_LName,
sum(SE_Value) AS "TotalSculptureValue"
FROM Sculptor
JOIN Sculpture USING (SRID)
group by SR_LName
having sum(SE_Value) < 100000
HAVING is different from WHERE. If you want to restrict the non-aggregated columns, like name, you would use WHERE:
SELECT SR_LName,
sum(SE_Value) AS "TotalSculptureValue"
FROM Sculptor
JOIN Sculpture USING (SRID)
where SR_LName like '%Ali%'
group by SR_LName
having sum(SE_Value) < 100000
This would give you all sculptors with names that contain Ali and have total value of less than 100000.
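The progression above can be tried out with a quick sketch in Python's sqlite3 module. The sculptor names and values are made up, the tables are trimmed to the columns the query touches, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical rows; only the columns used by the query are created.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sculptor (SRID INTEGER, SR_LName TEXT);
    CREATE TABLE Sculpture (SEID INTEGER, SE_Value INTEGER, SRID INTEGER);
    INSERT INTO Sculptor VALUES (1, 'Alison'), (2, 'Moore'), (3, 'Alito');
    INSERT INTO Sculpture VALUES
        (10, 40000, 1), (11, 30000, 1),
        (12, 90000, 2), (13, 50000, 2),
        (14, 20000, 3);
""")

# Step 1: totals for every sculptor.
totals = conn.execute("""
    SELECT SR_LName, SUM(SE_Value) AS TotalSculptureValue
    FROM Sculptor JOIN Sculpture USING (SRID)
    GROUP BY SR_LName
    ORDER BY TotalSculptureValue
""").fetchall()

# Step 2: WHERE filters rows by name, HAVING filters groups by their sum.
filtered = conn.execute("""
    SELECT SR_LName, SUM(SE_Value) AS TotalSculptureValue
    FROM Sculptor JOIN Sculpture USING (SRID)
    WHERE SR_LName LIKE '%Ali%'
    GROUP BY SR_LName
    HAVING SUM(SE_Value) < 100000
    ORDER BY TotalSculptureValue
""").fetchall()

print(totals)
print(filtered)
```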

MS Access count query does not produce wanted results

I have a table (tblExam) showing exam data score designed as follow:
Exam Name: String
Score: number (percent)
Basically I am trying to pull the records by Exam name where the score are less than a specific amount (0.695 in my case).
I am using the following statement to get the results:
SELECT DISTINCTROW tblExam.name, Count(tblExam.name) AS CountOfName
FROM tblExam WHERE (((tblExam.Score)<0.695))
GROUP BY tblExam.name;
This works fine but does not display the exam that have 0 records more than 0.695; in other words I am getting this:
Exam Name count
firstExam 2
secondExam 1
thirdExam 3
The count of 0 and any exams with score above 0.695 do not show up. What I would like is something like this:
Exam Name count
firstExam 2
secondExam 1
thirdExam 3
fourthExam 0
fifthExam 0
sixthExam 2
.
..
.etc...
I hope that I am making sense here. I think that I need some kind of LEFT JOIN to display all of the exam names but I cannot come up with the proper syntax.
It seems you want to display all name groups and, within each group, the count of Score < 0.695. So I think you should move < 0.695 from the WHERE to the Count() expression --- actually remove the WHERE clause.
SELECT
e.name,
Count(IIf(e.Score < 0.695, 1, Null)) AS CountOfName
FROM tblExam AS e
GROUP BY e.name;
That works because Count() counts only non-Null values. You could use Sum() instead of Count() if that seems clearer:
Sum(IIf(e.Score < 0.695, 1, 0)) AS CountOfName
Note DISTINCTROW is not useful in a GROUP BY query, because the grouping makes the rows unique without it. So I removed DISTINCTROW from the query.
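Here is a rough sketch of the same idea outside Access, using Python's sqlite3 module. IIf is replaced by a CASE expression, which plays the same role here, and the scores are invented so that one exam has nothing below the threshold.

```python
import sqlite3

# Hypothetical scores; secondExam has no score below 0.695.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblExam (name TEXT, Score REAL);
    INSERT INTO tblExam VALUES
        ('firstExam', 0.50), ('firstExam', 0.60), ('firstExam', 0.90),
        ('secondExam', 0.80), ('secondExam', 0.75),
        ('thirdExam', 0.10);
""")

# Filtering in WHERE removes secondExam's rows before grouping, so its group never exists.
filtered = conn.execute("""
    SELECT name, COUNT(name) FROM tblExam
    WHERE Score < 0.695
    GROUP BY name ORDER BY name
""").fetchall()

# Moving the test into COUNT keeps every group; CASE without ELSE yields NULL, which COUNT skips.
conditional = conn.execute("""
    SELECT name, COUNT(CASE WHEN Score < 0.695 THEN 1 END) FROM tblExam
    GROUP BY name ORDER BY name
""").fetchall()

# The SUM variant adds 1 or 0 per row and gives the same numbers.
summed = conn.execute("""
    SELECT name, SUM(CASE WHEN Score < 0.695 THEN 1 ELSE 0 END) FROM tblExam
    GROUP BY name ORDER BY name
""").fetchall()

print(filtered)
print(conditional)
print(summed)
```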
Do I detect a contradiction? The query calls for results <0.695 but your text says you are also looking for results >0.695. Perhaps I don't understand. Does this give you what you are looking for:
SELECT DISTINCTROW tblExam.ExamName, Count(tblExam.ExamName) AS CountOfExamName
FROM tblExam
WHERE (((tblExam.Score)<0.695 Or (tblExam.Score)>0.695))
GROUP BY tblExam.ExamName;

MySQL ORDER BY Column = value AND distinct?

I'm getting grey hair by now...
I have a table like this.
ID - Place - Person
1 - London - Anna
2 - Stockholm - Johan
3 - Gothenburg - Anna
4 - London - Nils
And I want to get the result where all the different persons are included, but I want to choose which Place to order by.
For example. I want to get a list where they are ordered by LONDON and the rest will follow, but distinct on PERSON.
Output like this:
ID - Place - Person
1 - London - Anna
4 - London - Nils
2 - Stockholm - Johan
Tried this:
SELECT ID, Person
FROM users
ORDER BY FIELD(Place,'London'), Person ASC
But it gives me:
ID - Place - Person
1 - London - Anna
4 - London - Nils
3 - Gothenburg - Anna
2 - Stockholm - Johan
And I really don't want Anna, or any person, to be in the result more than once.
This is one way to get the specified output, but this uses MySQL specific behavior which is not guaranteed:
SELECT q.ID
, q.Place
, q.Person
FROM ( SELECT IF(p.Person<=>@prev_person,0,1) AS r
, @prev_person := p.Person AS person
, p.Place
, p.ID
FROM users p
CROSS
JOIN (SELECT @prev_person := NULL) i
ORDER BY p.Person, !(p.Place<=>'London'), p.ID
) q
WHERE q.r = 1
ORDER BY !(q.Place<=>'London'), q.Person
This query uses an inline view to return all the rows in a particular order, by Person, so that all of the 'Anna' rows are together, followed by all the 'Johan' rows, etc. The set of rows for each person is ordered with Place='London' first, then by ID.
The "trick" is to use a MySQL user variable to compare the values from the current row with values from the previous row. In this example, we're checking if the 'Person' on the current row is the same as the 'Person' on the previous row. Based on that check, we return a 1 if this is the "first" row we're processing for a person, otherwise we return a 0.
The outermost query processes the rows from the inline view, and excludes all but the "first" row for each Person (the 0 or 1 we returned from the inline view.)
(This isn't the only way to get the resultset. But this is one way of emulating analytic functions which are available in other RDBMS.)
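For comparison, in databases that support window functions (including MySQL 8.0 and later), we could use SQL something like this. The row number has to be computed in an inline view, because a column alias such as rn cannot be referenced in the WHERE clause of the same query level:
SELECT v.ID
, v.Place
, v.Person
FROM ( SELECT ROW_NUMBER() OVER (PARTITION BY t.Person ORDER BY
CASE WHEN t.Place='London' THEN 0 ELSE 1 END, t.ID) AS rn
, t.ID
, t.Place
, t.Person
FROM users t
) v
WHERE v.rn=1
ORDER BY CASE WHEN v.Place='London' THEN 0 ELSE 1 END, v.Person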
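A window-function version can be tried out against the question's four rows with Python's sqlite3 module. This is only a sketch: it assumes the bundled SQLite is 3.25 or newer (the first release with window functions), and it wraps the ROW_NUMBER() call in a subquery so the outer query can filter on it.

```python
import sqlite3

# The four rows from the question, loaded into an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (ID INTEGER, Place TEXT, Person TEXT);
    INSERT INTO users VALUES
        (1, 'London', 'Anna'), (2, 'Stockholm', 'Johan'),
        (3, 'Gothenburg', 'Anna'), (4, 'London', 'Nils');
""")

# Number each person's rows with London first, keep row 1, then list London rows first.
rows = conn.execute("""
    SELECT ID, Place, Person
    FROM (
        SELECT ID, Place, Person,
               ROW_NUMBER() OVER (
                   PARTITION BY Person
                   ORDER BY CASE WHEN Place = 'London' THEN 0 ELSE 1 END, ID
               ) AS rn
        FROM users
    ) AS ranked
    WHERE rn = 1
    ORDER BY CASE WHEN Place = 'London' THEN 0 ELSE 1 END, Person
""").fetchall()

for row in rows:
    print(row)
```

Each person appears once, and Anna's London row wins over her Gothenburg row.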
Followup
At the beginning of the answer, I referred to MySQL behavior that was not guaranteed. I was referring to the usage of MySQL User-Defined variables within a SQL statement.
Excerpts from MySQL 5.5 Reference Manual http://dev.mysql.com/doc/refman/5.5/en/user-variables.html
"As a general rule, other than in SET statements, you should never assign a value to a user variable and read the value within the same statement."
"For other statements, such as SELECT, you might get the results you expect, but this is not guaranteed."
"the order of evaluation for expressions involving user variables is undefined."
Try this:
SELECT ID, Place, Person
FROM users
GROUP BY Person
ORDER BY FIELD(Place,'London') DESC, Person ASC;
You want to use group by instead of distinct:
SELECT ID, Person
FROM users
GROUP BY ID, Person
ORDER BY MAX(FIELD(Place, 'London')), Person ASC;
The GROUP BY does the same thing as SELECT DISTINCT. But, you are allowed to mention other fields in clauses such as HAVING and ORDER BY.

generate a mean for a 2-uple with MySQL

I can generate a table from records like that :
ID|Var1|Var2|Measure
1 10 13 10
1 10 15 8
1 15 13 0
...
One ID can have several Var2 values that are identical. How can I generate a mean for each 2-tuple (ID, Var2), like this:
ID|Var2|Mean_Measure
1 13 5
1 14 8
...
2 13 7
Thank you
You would need to use a GROUP BY clause to group the rows with the same ID and Var2 together and then the AVG function calculates the average:
SELECT t.ID, t.Var2, AVG(t.Measure) AS Mean_Measure FROM YourTable t GROUP BY t.ID, t.Var2
I might add that GROUP BY will alter the output of the query quite a bit. It also adds some restrictions on the output. First off - after a group by you can only add expressions in the SELECT clause where one of the following applies:
The expression is part of the GROUP BY clause
The expression is an application of an aggregate function
In the above example t.ID and t.Var2 exists in the GROUP BY clause and AVG(t.Measure) is an application of the aggregate function AVG on t.Measure.
When dealing with WHERE clauses and GROUP BY there's also some things to note:
WHERE is applied before the GROUP BY; this means that aggregate functions such as AVG cannot be used in the WHERE clause
If you wish to filter on the grouped results (for example on an aggregate value), use HAVING instead of WHERE
I hope this makes sense - and for more and better information on how GROUP BYs work - I'd suggest consulting the MySQL manual on the topic.
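As a small illustration, the sketch below runs the averaging query in Python's sqlite3 module (standing in for MySQL), then one variant with a WHERE filter on a plain column and one with a HAVING filter on the aggregate. The first three rows come from the question; the fourth is invented so that a second ID appears.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE YourTable (ID INTEGER, Var1 INTEGER, Var2 INTEGER, Measure INTEGER);
    INSERT INTO YourTable VALUES
        (1, 10, 13, 10), (1, 10, 15, 8), (1, 15, 13, 0),
        (2, 10, 13, 7);  -- hypothetical extra row
""")

# Mean of Measure for each (ID, Var2) pair.
averages = conn.execute("""
    SELECT t.ID, t.Var2, AVG(t.Measure) AS Mean_Measure
    FROM YourTable t
    GROUP BY t.ID, t.Var2
    ORDER BY t.ID, t.Var2
""").fetchall()

# WHERE on a plain column: only rows with Var1 = 10 enter the averages.
where_rows = conn.execute("""
    SELECT t.ID, t.Var2, AVG(t.Measure) AS Mean_Measure
    FROM YourTable t
    WHERE t.Var1 = 10
    GROUP BY t.ID, t.Var2
    ORDER BY t.ID, t.Var2
""").fetchall()

# HAVING on the aggregate: only groups whose mean exceeds 6 are returned.
having_rows = conn.execute("""
    SELECT t.ID, t.Var2, AVG(t.Measure) AS Mean_Measure
    FROM YourTable t
    GROUP BY t.ID, t.Var2
    HAVING AVG(t.Measure) > 6
    ORDER BY t.ID, t.Var2
""").fetchall()

print(averages)
print(where_rows)
print(having_rows)
```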

Mysql subquery with sum causing problems

This is a summary version of the problems I am encountering, but hits the nub of my problem. The real problem involves huge UNION groups of monthly data tables, but the SQL would be huge and add nothing. So:
SELECT entity_id,
sum(day_call_time) as day_call_time
from (
SELECT entity_id,
sum(answered_day_call_time) as day_call_time
FROM XCDRDNCSum201108
where (day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
) as summary
is the problem: when the table in the subquery XCDRDNCSum201108 returns no rows, because it is a sum, the column values contain null. And entity_id is part of the primary key, and cannot be null.
If I take out the sum, and just query entity_id, the subquery contains no rows, and thus the outer query does not fail, but when I use sum, I get error 1048 Column 'entity_id' cannot be null
how do I work around this problem ? Sometimes there is no data.
You are completely overworking the query... pre-summing inside, then summing again outside. In addition, I understand you are not a DBA, but if you are ever doing an aggregation, you TYPICALLY need the criteria that it's grouped by. In the case presented here, you are getting the sum of calls for all entity IDs. So you must have a group by on any non-aggregates. However, if all you care about is the grand total WITHOUT respect to the entity_ID, then you could skip the group by, but would also NOT include the actual entity ID...
If you want inclusive to show actual time per specific entity ID...
SELECT
entity_id,
sum(answered_day_call_time) as day_call_time,
count(*) number_of_calls
FROM
XCDRDNCSum201108
where
(day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
group by
entity_id
This would result in something like (fictitious data)
Entity_ID Day_Call_Time Number_Of_Calls
1 10 3
2 45 4
3 27 2
If all you cared about were the total call times
SELECT
sum(answered_day_call_time) as day_call_time,
count(*) number_of_calls
FROM
XCDRDNCSum201108
where
(day_of_the_month >= 10 AND day_of_the_month<=24)
and LPAD(core_range,4,"0")="0987"
and LPAD(subrange,3,"0")="654"
and SUBSTR(LPAD(core_number,7,"0"),4,7)="3210"
This would result in something like (fictitious data)
Day_Call_Time Number_Of_Calls
82 9
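The "no data" case from the question behaves differently in the two forms, which a short sketch with Python's sqlite3 module can show. SQLite stands in for MySQL, and the table is a hypothetical two-column cut-down of XCDRDNCSum201108 that is left empty on purpose.

```python
import sqlite3

# Hypothetical, deliberately empty stand-in for the monthly table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (entity_id INTEGER, answered_day_call_time INTEGER)")

# Without GROUP BY an aggregate query still returns one row; SUM over no rows is NULL.
ungrouped = conn.execute(
    "SELECT SUM(answered_day_call_time), COUNT(*) FROM calls"
).fetchall()

# With GROUP BY there are no groups, so no rows come back at all.
grouped = conn.execute(
    "SELECT entity_id, SUM(answered_day_call_time), COUNT(*) FROM calls GROUP BY entity_id"
).fetchall()

print(ungrouped)
print(grouped)
```

So with the group by in place, an empty month produces an empty result rather than a single row of NULLs.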
Would:
sum(answered_day_call_time) as day_call_time
changed to
ifnull(sum(answered_day_call_time),0) as day_call_time
work? I'm assuming mysql here but the coalesce function would/should work too.
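A quick way to check the suggestion, sketched with Python's sqlite3 module (SQLite supports both IFNULL and COALESCE, as MySQL does). The table is a hypothetical empty stand-in for the real one.

```python
import sqlite3

# Hypothetical empty table standing in for a month with no matching rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (entity_id INTEGER, answered_day_call_time INTEGER)")

# SUM over zero rows is NULL; IFNULL / COALESCE replace it with 0.
plain = conn.execute("SELECT SUM(answered_day_call_time) FROM calls").fetchone()[0]
with_ifnull = conn.execute("SELECT IFNULL(SUM(answered_day_call_time), 0) FROM calls").fetchone()[0]
with_coalesce = conn.execute("SELECT COALESCE(SUM(answered_day_call_time), 0) FROM calls").fetchone()[0]

print(plain, with_ifnull, with_coalesce)
```

Note that this only replaces the NULL sum; it does not by itself supply a value for entity_id.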