I have the following SQL:
SELECT tbl_G_stats_atp.PK_G, tbl_G_stats_atp.InjuryCnt
FROM tbl_G_stats_atp
WHERE (((tbl_G_stats_atp.ID_A)=89) AND ((tbl_G_stats_atp.DATE_S)<37500))
GROUP BY tbl_G_stats_atp.PK_G, tbl_G_stats_atp.InjuryCnt;
It produces this result:
+---------+-----------+
| PK_G | InjuryCnt |
+---------+-----------+
| 1203857 | 0 |
| 1203881 | 0 |
| 1203890 | 0 |
| 1203913 | 0 |
| 1203916 | 0 |
| 1203989 | 0 |
| 1204001 | 0 |
| 1204102 | 0 |
| 1204172 | 0 |
+---------+-----------+
I want to select the last record so have used this SQL:
SELECT Last(tbl_G_stats_atp.PK_G) AS LastOfPK_G, tbl_G_stats_atp.InjuryCnt
FROM tbl_G_stats_atp
WHERE (((tbl_G_stats_atp.ID_A)=89) AND ((tbl_G_stats_atp.DATE_S)<37500))
GROUP BY tbl_G_stats_atp.InjuryCnt
ORDER BY Last(tbl_G_stats_atp.PK_G);
However it returns the first record (1203857).
I realise I can use this SQL as a replacement:
SELECT Max(tbl_G_stats_atp.PK_G) AS MaxOfPK_G, tbl_G_stats_atp.InjuryCnt
FROM tbl_G_stats_atp
WHERE (((tbl_G_stats_atp.ID_A)=89) AND ((tbl_G_stats_atp.DATE_S)<37500))
GROUP BY tbl_G_stats_atp.InjuryCnt;
However I'd like to understand why it's doing this. I may in future want to select the last record on a non-numeric field...
You have to be careful with First and Last because records have no intrinsic order. Even with an ORDER BY clause, the results may not be what you expect. I have avoided Last/First, but I just did a simple test and was able to return the value from the last record added to the table, with no WHERE, GROUP BY, or ORDER BY clauses included.
If you want to return all fields from that record, consider:
SELECT TOP 1 tbl_G_stats_atp.* FROM tbl_G_stats_atp WHERE ID_A=89 AND DATE_S<37500 ORDER BY PK_G DESC;
Even then, there must be a field of unique values that can be relied on to order the records so that the desired record is brought to the top. Usually an autonumber ID is positive and increasing (I've never seen otherwise) and should accomplish that. Or perhaps a date/time field will serve.
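As a minimal sketch of the "order by a unique key and take the top row" idea, here is the same query run from Python against an in-memory SQLite database. SQLite is only a stand-in for Access here: it writes LIMIT 1 where Access writes TOP 1, and the DATE_S values and the extra row are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_G_stats_atp ("
    "PK_G INTEGER PRIMARY KEY, ID_A INTEGER, DATE_S INTEGER, InjuryCnt INTEGER)"
)
# Rows are inserted out of key order on purpose: the result must not depend
# on insertion order. DATE_S values are invented for this sketch.
conn.executemany(
    "INSERT INTO tbl_G_stats_atp VALUES (?, ?, ?, ?)",
    [
        (1204172, 89, 37400, 0),
        (1203857, 89, 37000, 0),
        (1204500, 89, 37600, 0),  # excluded by DATE_S < 37500
        (1203989, 89, 37200, 0),
    ],
)

# Sort by the unique key descending and keep only the first row.
last_row = conn.execute(
    "SELECT * FROM tbl_G_stats_atp "
    "WHERE ID_A = 89 AND DATE_S < 37500 "
    "ORDER BY PK_G DESC LIMIT 1"
).fetchone()
print(last_row)
```

The same pattern works for a non-numeric field, as long as the ORDER BY column (or combination of columns) is unique enough to make "last" unambiguous.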
According to the MySQL documentation I read, SELECT executes before GROUP BY. I have a table named Views (shown below) and this query:
select distinct(viewer_id) as id
from Views v
group by viewer_id,view_date
having count(distinct(article_id))>1;
In this query, if SELECT is performed before GROUP BY as the documentation says, how can it group by view_date when only viewer_id is selected? This has really confused me about the exact order in which GROUP BY and SELECT work.
+------------+-----------+-----------+------------+
| article_id | author_id | viewer_id | view_date |
+------------+-----------+-----------+------------+
| 1 | 3 | 5 | 2019-08-01 |
| 3 | 4 | 5 | 2019-08-01 |
| 1 | 3 | 6 | 2019-08-02 |
| 2 | 7 | 7 | 2019-08-01 |
| 2 | 7 | 6 | 2019-08-02 |
| 4 | 7 | 1 | 2019-07-22 |
| 3 | 4 | 4 | 2019-07-21 |
| 3 | 4 | 4 | 2019-07-21 |
+------------+-----------+-----------+------------+
There is no order to the evaluation. A SQL query describes the result set.
It is true that MySQL has a rather naive optimizer, so you can often see what the resulting query will be. But you should not think of the clauses as being evaluated in a particular order.
You might be confusing evaluation of the query with scoping rules. These affect how a particular identifier is determined.
You should not think in terms of order of execution, but rather in terms of correctness of the statement. The select clause must be consistent with the group by clause: that is, any column that is present in the select clause and that is not part of an aggregate function must belong to the group by clause.
It is, on the other hand, perfectly valid to have columns in the group by clause that do not belong to the select clause, although the results might be a bit difficult to understand, because some information is missing about how the groups were built.
If we remove the distinct in the select clause, your query would read:
select viewer_id as id
from views v
group by viewer_id, view_date
having count(distinct(article_id)) > 1;
This returns the viewer_id for every (viewer_id, view_date) pair that has more than one distinct article_id. A given viewer_id may appear more than once in the result set, if they satisfied the condition on more than one date.
Then, distinct filters out duplicate viewer_ids: as a result, you get the list of viewers that viewed more than one article on a single date.
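To make this concrete, here is a small sketch that loads the sample rows from the question into an in-memory SQLite database (used here only as a convenient stand-in for MySQL) and runs both forms of the query. Note that in this sample no viewer qualifies on two different dates, so distinct happens to have nothing to remove.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Views ("
    "article_id INTEGER, author_id INTEGER, viewer_id INTEGER, view_date TEXT)"
)
conn.executemany(
    "INSERT INTO Views VALUES (?, ?, ?, ?)",
    [
        (1, 3, 5, "2019-08-01"),
        (3, 4, 5, "2019-08-01"),
        (1, 3, 6, "2019-08-02"),
        (2, 7, 7, "2019-08-01"),
        (2, 7, 6, "2019-08-02"),
        (4, 7, 1, "2019-07-22"),
        (3, 4, 4, "2019-07-21"),
        (3, 4, 4, "2019-07-21"),
    ],
)

# Groups are built on (viewer_id, view_date); view_date is shown here only
# to make the groups visible.
per_date = conn.execute(
    "SELECT viewer_id, view_date FROM Views "
    "GROUP BY viewer_id, view_date "
    "HAVING COUNT(DISTINCT article_id) > 1 "
    "ORDER BY viewer_id"
).fetchall()

# The original query: view_date is grouped on but not selected.
viewer_ids = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT viewer_id AS id FROM Views "
        "GROUP BY viewer_id, view_date "
        "HAVING COUNT(DISTINCT article_id) > 1 "
        "ORDER BY id"
    )
]
print(per_date)
print(viewer_ids)
```

Viewer 4 is excluded even though they have two rows on the same date, because both rows are for the same article.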
I have a table which looks like this
|Application No | Status | Amount | Type |
==========================================
|90909090 | Null | 3,000 | Null |
|90909090 | Forfeit| Null | A |
What I want to achieve is to combine the values together and end with a result like
|Application No | Status | Amount | Type |
==========================================
|90909090 | Forfeit| 3,000 | A |
I am new to SQL Query and have no idea how to do this
Thanks in advance
No need to join, use max() aggregate function and group by:
select applicationno, max(status), max(amount), max(type)
from yourtable
group by applicationno
However, if you have several non-null values for an application number in a field, then you may have to define a more granular rule than a simple aggregation via max.
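A quick way to check this is to run it against the two sample rows. The sketch below uses Python's sqlite3 module with an in-memory database; it assumes amount is stored as a number (3000) rather than the formatted text "3,000", and it relies on max() ignoring NULLs, which is standard aggregate behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable ("
    "applicationno INTEGER, status TEXT, amount INTEGER, type TEXT)"
)
# None is stored as SQL NULL.
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?, ?, ?)",
    [
        (90909090, None, 3000, None),
        (90909090, "Forfeit", None, "A"),
    ],
)

# MAX() skips NULLs, so each column keeps its single non-null value.
merged = conn.execute(
    "SELECT applicationno, MAX(status), MAX(amount), MAX(type) "
    "FROM yourtable GROUP BY applicationno"
).fetchall()
print(merged)
```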
I need to return the best 5 scores in each category from a table. So far I have tried the query below, following an example from this site: selecting top n records per group
query:
select
subject_name,substring_index(substring_index
(group_concat(exams_scores.admission_no order by exams_scores.score desc),',',value),',',-1) as names,
substring_index(substring_index(group_concat(score order by score desc),',',value),',',-1)
as orderedscore
from exams_scores,students,subjects,tinyint_asc
where tinyint_asc.value >=1 and tinyint_asc.value <=5 and exam_id=2
and exams_scores.admission_no=students.admission_no and students.form_id=1 and
exams_scores.subject_code=subjects.subject_code group by exams_scores.subject_code,value;
I get the top n as I need, but my problem is that it's returning duplicates at random, and I don't know where they are coming from.
As you can see, English and Mathematics have duplicates which should not be there:
+------------------+-------+--------------+
| subject_name | names | orderedscore |
+------------------+-------+--------------+
| English | 1500 | 100 |
| English | 1500 | 100 |
| English | 2491 | 100 |
| English | 1501 | 99 |
| English | 1111 | 99 |
| Mathematics | 1004 | 100 |
| Mathematics | 1004 | 100 |
| Mathematics | 2722 | 99 |
| Mathematics | 2734 | 99 |
| Mathematics | 2712 | 99 |
+------------------+-------+--------------+
I have checked the table and no duplicates exist.
To confirm there are no duplicates in the table:
select * from exams_scores
having(exam_id=2) and (subject_code=121) and (admission_no=1004);
result :
+------+--------------+---------+--------------+-------+
| id | admission_no | exam_id | subject_code | score |
+------+--------------+---------+--------------+-------+
| 4919 | 1004 | 2 | 121 | 100 |
+------+--------------+---------+--------------+-------+
1 row in set (0.00 sec)
I get the same result for English.
If I run the query about 5 times, I sometimes end up with another field having duplicate values.
Can anyone tell me why my query is behaving this way? I tried adding distinct inside
group_concat(distinct(exams_scores.admission_no))
but that didn't work.
You're grouping by exams_scores.subject_code, value. If you add them to your selected columns (...as orderedscore, exams_scores.subject_code, value from...), you should see that all rows are distinct with respect to these two columns you grouped by. Which is the correct semantics of GROUP BY.
Edit, to clarify:
First, the SQL server removes some rows according to your WHERE clause.
Afterwards, it groups the remaining rows according to your GROUP BY clause.
Finally, it selects the columns you specified, either by directly returning a column's value or performing a GROUP_CONCAT on some of the columns and returning their accumulated value.
If you select columns not included in the GROUP BY clause, the returned results for these columns are arbitrary, since the SQL server reduces all rows equal with respect to the columns specified in the GROUP BY clause to one single row - as for the remaining columns, the results are pretty much undefined (hence the "randomness" you're experiencing), because - what should the server choose as a value for this column? It can only pick one randomly from all the reduced rows.
In fact, some SQL servers won't perform such a query and return an SQL error instead, since the result for those columns would be undefined, which is something you don't want to have in general. With these servers (I believe MSSQL is one of them), you more or less can only have columns in your SELECT clause which are part of your GROUP BY clause or wrapped in an aggregate function.
Edit 2: Which, finally, means that you have to refine your GROUP BY clause to obtain the grouping that you want.
I need to sum a result that I'm getting from an existing query. It has to extend the current query and remain a single query
(by this I mean NOT - DO 1; DO 2; DO3;)
My current query is:
SELECT SUM((count)/(SELECT COUNT(*) FROM mobile_site_statistics WHERE campaign_id='1201' AND start_time BETWEEN CURDATE()-1 AND CURDATE())*100) AS percentage FROM mobile_site_statistics WHERE device NOT LIKE '%Pingdom%' AND campaign_id='1201' AND start_time BETWEEN (CURDATE()-1) AND CURDATE() GROUP BY device ORDER BY 1 DESC LIMIT 10;
This returns:
+------------+
| percentage |
+------------+
| 47.3813 |
| 19.7940 |
| 5.6672 |
| 5.0801 |
| 3.9603 |
| 3.8500 |
| 3.1294 |
| 2.9924 |
| 2.9398 |
| 2.7136 |
+------------+
What I need is the total of that table (the total percentage used by the top 10 devices), and nothing else. It has to be a single query that includes the initial query, because another program is using the query.
Is this possible? Every way I have tried so far has failed. We tried temporary tables, but that turned into multiple queries.
Just do a
SELECT SUM(percentage) AS total FROM (<YOUR_QUERY>) a
and replace the sub-query <YOUR_QUERY> with your initial query
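Here is a rough sketch of that wrapping, run through Python's sqlite3 module. The table is cut down to two columns, the hits column and the device names are hypothetical, and the inner query is a simplified version of the one in the question (top 3 instead of top 10, no date or campaign filter), so treat it as an illustration of the derived-table pattern rather than a drop-in replacement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified schema: one row per device with a hit count.
conn.execute("CREATE TABLE mobile_site_statistics (device TEXT, hits INTEGER)")
conn.executemany(
    "INSERT INTO mobile_site_statistics VALUES (?, ?)",
    [("phone_a", 50), ("phone_b", 30), ("tablet_c", 15), ("phone_d", 5)],
)

# Stand-in for <YOUR_QUERY>: percentage per device, largest first, top 3 only.
inner_query = """
    SELECT SUM(hits) * 100.0 / (SELECT SUM(hits) FROM mobile_site_statistics)
           AS percentage
    FROM mobile_site_statistics
    GROUP BY device
    ORDER BY 1 DESC
    LIMIT 3
"""

# The whole thing is still one statement: the inner query becomes a derived
# table (aliased "a") and the outer query sums its single column.
total = conn.execute(
    f"SELECT SUM(percentage) AS total FROM ({inner_query}) a"
).fetchone()[0]
print(total)
```

The alias after the closing parenthesis matters in MySQL, which requires every derived table to have one.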
I just came across this database query and wonder what exactly it does. Please clarify.
select * from tablename order by priority='High' DESC, priority='Medium' DESC, priority='Low' DESC;
Looks like it'll order the priority by High, Medium then Low.
Because if the order by clause was just priority DESC, then it would sort alphabetically in descending order, which would give
Medium
Low
High
It basically lists all fields from the table "tablename", ordered by priority High, Medium, Low.
So High appears first in the list, then Medium, and then finally Low
i.e.
* High
* High
* High
* Medium
* Medium
* Low
Where * is the rest of the fields in the table
Others have already explained what it does (High comes first, then Medium, then Low). I'll just add a few words about WHY that is so.
The reason is that the result of a comparison in MySQL is an integer - 1 if it's true, 0 if it's false. And you can sort by integers, so this construct works. I'm not sure this would fly on other RDBMS though.
Added: OK, a more detailed explanation. First of all, let's start with how ORDER BY works.
ORDER BY takes a comma-separated list of arguments which it evaluates for every row. Then it sorts by these arguments. So, for example, let's take the classical example:
SELECT * from MyTable ORDER BY a, b, c desc
What ORDER BY does in this case, is that it gets the full result set in memory somewhere, and for every row it evaluates the values of a, b and c. Then it sorts it all using some standard sorting algorithm (such as quicksort). When it needs to compare two rows to find out which one comes first, it first compares the values of a for both rows; if those are equal, it compares the values of b; and, if those are equal too, it finally compares the values of c. Pretty simple, right? It's what you would do too.
OK, now let's consider something trickier. Take this:
SELECT * from MyTable ORDER BY a+b, c-d
This is basically the same thing, except that before all the sorting, ORDER BY takes every row and calculates a+b and c-d and stores the results in invisible columns that it creates just for sorting. Then it just compares those values like in the previous case. In essence, ORDER BY creates a table like this:
+-------------------+-----+-----+-----+-----+-------+-------+
| Some columns here | A | B | C | D | A+B | C-D |
+-------------------+-----+-----+-----+-----+-------+-------+
| | 1 | 2 | 3 | 4 | 3 | -1 |
| | 8 | 7 | 6 | 5 | 15 | 1 |
| | ... | ... | ... | ... | ... | ... |
+-------------------+-----+-----+-----+-----+-------+-------+
And then sorts the whole thing by the last two columns, which it discards afterwards. You don't even see them in your result set.
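The "invisible columns" picture can be imitated in a few lines of plain Python. This is only a model of the idea described above, not how any database engine is actually implemented; the third row is made up so that the second sort key has a tie to break.

```python
rows = [
    {"a": 8, "b": 7, "c": 6, "d": 5},
    {"a": 1, "b": 2, "c": 3, "d": 4},
    {"a": 2, "b": 1, "c": 9, "d": 1},  # invented row: ties with the one above on a+b
]

# Step 1: compute the sort expressions once per row (the "invisible columns").
decorated = [((r["a"] + r["b"], r["c"] - r["d"]), r) for r in rows]

# Step 2: sort on those computed values only; tuples compare left to right,
# so c-d is consulted only when a+b is equal.
decorated.sort(key=lambda pair: pair[0])

# Step 3: throw the helper values away and keep the original rows.
result = [r for _, r in decorated]
print(result)
```

The output rows contain only a, b, c and d, just as the result set of the SQL query never shows a+b or c-d.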
OK, something even weirder:
SELECT * from MyTable ORDER BY CASE WHEN a=b THEN c ELSE D END
Again - before sorting is performed, ORDER BY will go through each row, calculate the value of the expression CASE WHEN a=b THEN c ELSE D END and store it in an invisible column. This expression will always evaluate to some value, or you get an exception. Then it just sorts by that column, which contains simple values rather than a fancy formula.
+-------------------+-----+-----+-----+-----+-----------------------------------+
| Some columns here | A | B | C | D | CASE WHEN a=b THEN c ELSE D END |
+-------------------+-----+-----+-----+-----+-----------------------------------+
| | 1 | 2 | 3 | 4 | 4 |
| | 3 | 3 | 6 | 5 | 6 |
| | ... | ... | ... | ... | ... |
+-------------------+-----+-----+-----+-----+-----------------------------------+
Hopefully you are now comfortable with this part. If not, re-read it or ask for more examples.
Next thing is the boolean expressions. Or rather the boolean type, which for MySQL happens to be an integer. In other words SELECT 2>3 will return 0 and SELECT 2<3 will return 1. That's just it. The boolean type is an integer. And you can do integer stuff with it too. Like SELECT (2<3)+5 will return 6.
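You can try those three expressions without a MySQL server at hand. The sketch below uses SQLite through Python's sqlite3 module, on the assumption that it is an acceptable stand-in: SQLite also represents the result of a comparison as the integer 0 or 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Comparisons come back as plain integers, so they can take part in arithmetic.
false_val, true_val, arithmetic = conn.execute(
    "SELECT 2 > 3, 2 < 3, (2 < 3) + 5"
).fetchone()
print(false_val, true_val, arithmetic)
```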
OK, now let's put all this together. Let's take your query:
select * from tablename order by priority='High' DESC, priority='Medium' DESC, priority='Low' DESC;
What happens is that ORDER BY sees a table like this:
+-------------------+----------+-----------------+-------------------+----------------+
| Some columns here | priority | priority='High' | priority='Medium' | priority='Low' |
+-------------------+----------+-----------------+-------------------+----------------+
| | Low | 0 | 0 | 1 |
| | High | 1 | 0 | 0 |
| | Medium | 0 | 1 | 0 |
| | Low | 0 | 0 | 1 |
| | High | 1 | 0 | 0 |
| | Low | 0 | 0 | 1 |
| | Medium | 0 | 1 | 0 |
| | High | 1 | 0 | 0 |
| | Medium | 0 | 1 | 0 |
| | Low | 0 | 0 | 1 |
+-------------------+----------+-----------------+-------------------+----------------+
And it then sorts by the last three invisible columns which are discarded later.
Does it make sense now?
(P.S. In reality, of course, there are no invisible columns and the whole thing is made much trickier to get good speed, using indexes if possible and other stuff. However it is much easier to understand the process like this. It's not wrong either.)
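For anyone who wants to see the end result, here is a small runnable sketch using the ten priority values from the table above. It runs on SQLite via Python's sqlite3 module rather than MySQL, which is an assumption of convenience; SQLite likewise evaluates priority='High' to 1 or 0. The id column is added only so the table has a key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER PRIMARY KEY, priority TEXT)")

# Same ten values, in the same order, as the illustration table.
priorities = ["Low", "High", "Medium", "Low", "High",
              "Low", "Medium", "High", "Medium", "Low"]
conn.executemany(
    "INSERT INTO tablename (priority) VALUES (?)", [(p,) for p in priorities]
)

# Each comparison yields 1 or 0; DESC puts the 1s first for each key in turn.
ordered = [
    row[0]
    for row in conn.execute(
        "SELECT priority FROM tablename "
        "ORDER BY priority = 'High' DESC, "
        "priority = 'Medium' DESC, "
        "priority = 'Low' DESC"
    )
]
print(ordered)
```

A more portable way to get the same ordering on databases that do not treat comparisons as integers is a CASE expression in the ORDER BY clause, as in the earlier CASE example.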