ORDER BY does not work if COUNT is used - mysql

I have a table with following content
loan_application
+----+---------+
| id | user_id |
+----+---------+
| 1 | 10 |
| 2 | 10 |
| 3 | 10 |
+----+---------+
I want to fetch the 3rd record only if there are 3 records available. In this case I want id 3 and the total count must be 3. Here is what I expect:
+--------------+----+
| COUNT(la.id) | id |
+--------------+----+
| 3 | 3 |
+--------------+----+
Here is the query I tried:
SELECT COUNT(la.id), la.id FROM loan_application la HAVING COUNT(la.id) = 3 ORDER BY la.id DESC;
However, this gives me the following result:
+--------------+----+
| COUNT(la.id) | id |
+--------------+----+
| 3 | 1 |
+--------------+----+
The problem is that it returns id 1 even though I order by id descending, whereas I expect the id to be 3. Where am I going wrong?
Thanks.

In your case you can use this query:
SELECT COUNT(la.id), max(la.id) FROM loan_application la
GROUP BY user_id
I tried this against your table in my MySQL database.

When you have an aggregate function (in this instance COUNT()) in the select list without a GROUP BY clause, MySQL returns a single record, with the function applied to the whole table.
Under certain configuration settings (when the ONLY_FULL_GROUP_BY SQL mode is disabled), MySQL allows you to include fields in the select list that are neither in the GROUP BY clause nor aggregated. For such fields MySQL pretty much picks the first value it encounters while scanning the data, in your case the value 1 for id.
If you want to fetch the record whose id equals the number of records in the table, then I would use the following query:
select *
from loan_application
join (select count(*) as numrows from loan_application) t
where id=t.numrows and t.numrows=3
However, this implies that the values within the id field are continuous and there are no gaps.

You are selecting la.id along with an aggregate function (COUNT). Without a GROUP BY the whole table collapses into a single row: la.id is taken from the first record MySQL reads, while the count keeps accumulating over the rest. So you get the first la.id, not the last, and ORDER BY has nothing left to sort because only one row remains. To get the last la.id, apply the MAX function to that field.
Here's the updated query:
SELECT
COUNT(la.id),
MAX(la.id)
FROM
loan_application la
GROUP BY user_id
HAVING
COUNT(la.id) = 3
N.B.: You are using COUNT without a GROUP BY clause, so the aggregate function is applied to the whole table.
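To check the MAX + HAVING query against the sample rows from the question, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for MySQL (that substitution is an assumption made only so the snippet runs anywhere); the query itself is plain SQL and should behave the same way in MySQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL for this demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan_application (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany(
    "INSERT INTO loan_application (id, user_id) VALUES (?, ?)",
    [(1, 10), (2, 10), (3, 10)],
)

# Both selected columns are aggregates, so the result does not depend on
# which row the engine happens to read first.
row = conn.execute(
    """
    SELECT COUNT(la.id) AS total, MAX(la.id) AS last_id
    FROM loan_application la
    GROUP BY la.user_id
    HAVING COUNT(la.id) = 3
    """
).fetchone()
print(row)  # (3, 3)
```

If the user had only two applications, the HAVING clause would filter the group out and fetchone() would return None.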

Related

MySQL group/order behaves differently in 5.7

I have a table that looks like this:
id | text | language_id | other_id | dateCreated
1 | something | 1 | 5 | 2015-01-02
2 | something | 1 | 5 | 2015-01-01
3 | something | 2 | 5 | 2015-01-01
4 | something | 2 | 6 | 2015-01-01
and I want to get the latest row for each language_id among the rows that have other_id 5.
My query looks like this:
SELECT * FROM (
SELECT *
FROM tbl
WHERE other_id = 5
ORDER BY dateCreated DESC
) AS r
GROUP BY r.language_id
With MySQL 5.6 I get 2 rows with ID 1 and 3, which is what I want.
With MySQL 5.7.10 I get 2 rows with IDs 2 and 3 and it seems to me that the ORDER BY in the subquery is ignored.
Any ideas what the problem might be?
You should go with the query below:
SELECT
*
FROM tbl
INNER JOIN
(
SELECT
other_id,
language_id,
MAX(dateCreated) max_date_created
FROM tbl
WHERE other_id = 5
GROUP BY language_id
) AS t
ON tbl.language_id = t.language_id AND tbl.other_id = t.other_id AND
tbl.dateCreated = t.max_date_created
Using GROUP BY without an aggregate function picks a row from each group in arbitrary order. You should not rely on which row GROUP BY returns; MySQL makes no guarantee about it.
Quoting from this post
In a nutshell, MySQL allows omitting some columns from the GROUP BY,
for performance purposes, however this works only if the omitted
columns all have the same value (within a grouping), otherwise, the
value returned by the query are indeed indeterminate, as properly
guessed by others in this post. To be sure adding an ORDER BY clause
would not re-introduce any form of deterministic behavior.
Although not at the core of the issue, this example shows how using *
rather than an explicit enumeration of desired columns is often a bad
idea.
Excerpt from MySQL 5.0 documentation:
When using this feature, all rows in each group should have the same
values for the columns that are omitted from the GROUP BY part. The
server is free to return any value from the group, so the results are
indeterminate unless all values are the same.
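As a rough check of the join-on-MAX approach from the answer, the sketch below loads the four sample rows and runs essentially the same query. SQLite (via Python's sqlite3) is used here only as a convenient stand-in for MySQL, which is an assumption of this example. The derived table selects only language_id and the maximum date, and the other_id = 5 filter is repeated in the outer WHERE; that is a small variation on the answer, not a different technique.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL 5.6/5.7.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, text TEXT, "
    "language_id INTEGER, other_id INTEGER, dateCreated TEXT)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?)",
    [
        (1, "something", 1, 5, "2015-01-02"),
        (2, "something", 1, 5, "2015-01-01"),
        (3, "something", 2, 5, "2015-01-01"),
        (4, "something", 2, 6, "2015-01-01"),
    ],
)

# The derived table finds the newest date per language_id; joining back to
# tbl picks the full row, independent of any ORDER BY in a subquery.
latest_ids = [
    r[0]
    for r in conn.execute(
        """
        SELECT tbl.id
        FROM tbl
        INNER JOIN (
            SELECT language_id, MAX(dateCreated) AS max_date_created
            FROM tbl
            WHERE other_id = 5
            GROUP BY language_id
        ) AS t
          ON tbl.language_id = t.language_id
         AND tbl.dateCreated = t.max_date_created
        WHERE tbl.other_id = 5
        ORDER BY tbl.id
        """
    )
]
print(latest_ids)  # [1, 3]
```

Note that if two rows in the same language share the newest date, both come back; add a tie-breaker if exactly one row per group is required.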

Sum columns depending on another column value

I'm having trouble with summing the fields values based on another fields value.
I need to SUM(activities.points) based on activities.activity_type if it's used_points or added_points and put it in AS used_points/added_points.
Table activities:
id | subscription_id | activity_type | points
--------------------------------------------------
1 | 1 | used_points | 10
2 | 1 | used_points | 50
3 | 1 | added_points | 20
4 | 1 | added_points | 30
5 | 2 | used_points | 20
6 | 2 | used_points | 45
7 | 2 | added_points | 45
8 | 2 | added_points | 45
Table subscriptions:
id | name | current_points
-------------------------------------
1 | card_1 | 700
2 | card_2 | 900
What I need:
name | current_points | used_points | added_points
-----------------------------------------------------------
card_1 | 700 | 60 | 50
card_2 | 900 | 65 | 90
What I tried :
SELECT
subscriptions.name,
subscriptions.current_points,
IF(activities.activity_type="used_points", SUM(activities.points), null)
AS used_points,
IF(activities.activity_type="added_points", SUM(activities.points), null)
AS added_points
FROM activities
JOIN subscriptions
ON activities.subscription.id = subscription.id
GROUP BY subscriptions.name
Which is wrong.
Thanks
You want to use SUM(IF( )). You want to add up the values returned from the IF. You want that IF expression to be evaluated for each individual row. Then, use the SUM aggregate to add up the value returned for each row.
Remove the SUM aggregate from inside the IF expression and instead, wrap the IF inside a SUM.
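Here is a small sketch of what "wrap the IF inside a SUM" looks like against the sample data. It runs on SQLite through Python's sqlite3 module purely for convenience (an assumption of this example, not part of the question), and it uses CASE WHEN in place of IF() because CASE is accepted by both SQLite and MySQL; in MySQL, SUM(IF(cond, a.points, 0)) is the equivalent spelling.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subscriptions (id INTEGER PRIMARY KEY, name TEXT, current_points INTEGER);
    CREATE TABLE activities (
        id INTEGER PRIMARY KEY,
        subscription_id INTEGER,
        activity_type TEXT,
        points INTEGER
    );
    INSERT INTO subscriptions VALUES (1, 'card_1', 700), (2, 'card_2', 900);
    INSERT INTO activities VALUES
        (1, 1, 'used_points', 10), (2, 1, 'used_points', 50),
        (3, 1, 'added_points', 20), (4, 1, 'added_points', 30),
        (5, 2, 'used_points', 20), (6, 2, 'used_points', 45),
        (7, 2, 'added_points', 45), (8, 2, 'added_points', 45);
    """
)

# The condition is evaluated per row; SUM then adds up the per-row results.
totals = conn.execute(
    """
    SELECT s.name,
           s.current_points,
           SUM(CASE WHEN a.activity_type = 'used_points'  THEN a.points ELSE 0 END) AS used_points,
           SUM(CASE WHEN a.activity_type = 'added_points' THEN a.points ELSE 0 END) AS added_points
    FROM subscriptions s
    JOIN activities a ON a.subscription_id = s.id
    GROUP BY s.name, s.current_points
    ORDER BY s.name
    """
).fetchall()
print(totals)  # [('card_1', 700, 60, 50), ('card_2', 900, 65, 90)]
```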
Followup
Q: But why doesn't SUM() inside of IF work?
A: Well, it does work. It's just not working the way you want it to work.
The MySQL SUM function is an "aggregate" function. It aggregates rows together, and returns a single value.
For an expression of this form: IF(col='foo',SUM(numcol),0)
What MySQL is doing is aggregating all the rows into the SUM, and returning a single value.
Other databases would pitch a fit and throw an error at the reference to the non-aggregate col in that expression. MySQL is more lenient and treats the col reference as if it were an aggregate (like MIN(col) or MAX(col)), working on a group of rows and returning a single value. In this case, MySQL selects a single sample row. (It's not determinate which row will be "chosen" as the sample row.) So that reference to col is sort of like a GET_VALUE_FROM_SAMPLE_ROW(col). Once the aggregates are completed, that IF expression gets evaluated once.
If you start with this query, this is the set of rows you want to operate on.
SELECT s.name
, s.current_points
, a.activity_type
, a.points
, IF(a.activity_type='used_points',a.points,NULL) AS used_points
, IF(a.activity_type='added_points',a.points,NULL) AS added_points
FROM subscriptions s
JOIN activities a
ON a.subscription_id = s.id
When you add a GROUP BY clause, that's going to aggregate some of those rows together. What you will get back for the non-aggregates is values from a sample row.
Try adding GROUP BY s.name to the query, and see what is returned.
Also try adding in some aggregates, such as SUM(a.points)
SELECT s.name
, s.current_points
, a.activity_type
, a.points
, IF(a.activity_type='used_points',a.points,NULL) AS used_points
, IF(a.activity_type='added_points',a.points,NULL) AS added_points
, SUM(a.points) AS total_points
FROM subscriptions s
JOIN activities a
ON a.subscription_id = s.id
GROUP BY s.name
Finally, we can add in the expressions in your query into the SELECT list:
, IF(a.activity_type='used_points',SUM(a.points),NULL) AS if_used_sum
, IF(a.activity_type='added_points',SUM(a.points),NULL) AS if_added_sum
What we discover is that the value returned from each of these expressions is either SUM(a.points), which matches total_points, or NULL. We can also see the value of the activity_type column, retrieved from a single sample row for each group. So the expression is "working"; it's just not doing what you really want, which is for the conditional test to run on each individual row, returning either the points value or NULL, and then for those values to be summed for the group.
Your code is only slightly out:
SELECT
subscriptions.name,
subscriptions.current_points,
SUM(IF(activities.activity_type="used_points", activities.points, 0))
AS used_points,
SUM(IF(activities.activity_type="added_points", activities.points, 0))
AS added_points
FROM activities
JOIN subscriptions
ON activities.subscription_id = subscriptions.id
GROUP BY subscriptions.name, subscriptions.current_points
Note the fixed typo in the second-to-last line: you wrote activities.subscription.id instead of activities.subscription_id. Also, you grouped only by name rather than by name and current_points. I'm not sure whether that's allowed in MySQL (I use T-SQL), but it's good practice to have it there anyway.
Well, I did it without using the IF statement. Here's the example (http://sqlfiddle.com/#!2/076c3f/12):
SELECT
subs.name,
subs.current_points,
(SELECT SUM(points) FROM activities WHERE type = 1 AND subs_id = subs.id) AS used_points,
(SELECT SUM(points) FROM activities WHERE type = 2 AND subs_id = subs.id) AS added_points
FROM subs
NOTE: I changed the type from VARCHAR to INT to simplify (1 = used_points, 2 = added_points).
Try changing
IF(activities.activity_type="used_points", SUM(activities.points), null)
AS used_points,
IF(activities.activity_type="added_points", SUM(activities.points), null)
AS added_points
to
SUM(IF(activities.activity_type="used_points", activities.points, 0))
AS used_points,
SUM(IF(activities.activity_type="added_points", activities.points, 0))
AS added_points
This way the column is checked for each row, and either the row's points or 0 is added to the sum.
To sum a column of integer values (c1) based on another column of character values (c2), counting only the rows where c2 is neither empty nor NULL, the code below will help.
SELECT SUM(c1) FROM table_name WHERE c2 <> '' AND c2 IS NOT NULL

how to find duplicate count without counting original

I need to count the number of duplicate emails in a mysql database, but without counting the first one (considered the original). In this table, the query result should be the single value "3" (2 duplicate x@q.com plus 1 duplicate f@q.com).
TABLE
ID | Name | Email
1 | Mike | x@q.com
2 | Peter | p@q.com
3 | Mike | x@q.com
4 | Mike | x@q.com
5 | Frank | f@q.com
6 | Jim | f@q.com
My current query produces not one number, but multiple rows, one per email address regardless of how many duplicates of this email are in the table:
SELECT value, count(lds1.leadid) FROM leads_form_element lds1 LEFT JOIN leads lds2 ON lds1.leadID = lds2.leadID
WHERE lds2.typesID = "31" AND lds1.formElementID = '97'
GROUP BY lds1.value HAVING ( COUNT(lds1.value) > 1 )
It's not one query so I'm not sure if it would work in your case, but you could do one query to select the total number of rows, a second query to select distinct email addresses, and subtract the two. This would give you the total number of duplicates...
select count(*) from someTable;
select count(distinct Email) from someTable;
In fact, I don't know if this will work, but you could try doing it all in one query:
select (count(*)-(count(distinct Email))) from someTable
Like I said, untested, but let me know if it works for you.
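Since the one-query version was posted untested, here is a quick sanity check on the six sample rows. It is only a sketch: SQLite via Python's sqlite3 stands in for MySQL, and the table name someTable is the placeholder from the answer, not the asker's real schema.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL; someTable is a placeholder name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE someTable (ID INTEGER PRIMARY KEY, Name TEXT, Email TEXT)")
conn.executemany(
    "INSERT INTO someTable VALUES (?, ?, ?)",
    [
        (1, "Mike", "x@q.com"),
        (2, "Peter", "p@q.com"),
        (3, "Mike", "x@q.com"),
        (4, "Mike", "x@q.com"),
        (5, "Frank", "f@q.com"),
        (6, "Jim", "f@q.com"),
    ],
)

# Total rows minus distinct addresses = rows that repeat an earlier address.
(duplicates,) = conn.execute(
    "SELECT COUNT(*) - COUNT(DISTINCT Email) FROM someTable"
).fetchone()
print(duplicates)  # 3
```

One caveat: COUNT(DISTINCT Email) skips NULLs while COUNT(*) does not, so rows with a NULL email would be counted as duplicates unless they are filtered out first.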
Try doing a GROUP BY in a subquery and then summing up. Something like:
select sum(tot)
from
(
select email, count(1)-1 as tot
from someTable
group by email
having count(1) > 1
) dup

MySQL join query or sub query

I'm trying to write a query that selects mike's rows when they aren't among the three highest bids for a keyword. Rows 4 and 7 should be selected.
In short: if mike isn't in the three highest bids for a keyword, select the row.
How do I solve this? With a subquery?
$construct = "SELECT child.* FROM `temp-advertise` child
LEFT JOIN `temp-advertise` parent on child.keyword=parent.keyword
WHERE child.name='mike'
ORDER BY child.id DESC";
id | name| keyword | bid |
1 | mike| one | 7 |
2 | tom | one | 4 |
3 | ced | one | 6 |
4 | mike| two | 1 |
5 | tom | two | 5 |
6 | har | two | 5 |
7 | mike| one | 3 |
8 | har | two | 3 |
SELECT *
FROM `temp-advertise` ta
WHERE ta.keyword = 'one'
AND ta.name = 'mike'
AND ta.bid <
(
SELECT bid
FROM `temp-advertise` tai
WHERE tai.keyword = 'one'
ORDER BY
bid DESC
LIMIT 2, 1
)
Your structure doesn't look too promising, nor your sample data. However, that said, you want to know if "Mike" was in the top 3 per keyword... and that he has 3 bids.... 2 for "one", 1 for "two". From the raw data, it looks like Mike is in 1st place and 4th place for the "one" keyword, and 4th place for "two" keyword.
This should get you what you need with SOME respect to not doing a full query of all keywords. The first innermost query is to just get keywords bid on by "mike" (hence alias "JustMike"). Then join that to the temp-advertise on ONLY THOSE keywords.
Next, by using MySQL variables, we can keep track of the rank PER KEYWORD. The trick is the ORDER BY clause needs to return them in the order that represents proper ranking. In this case, each keyword first, then within each keyword, ordered by highest bid first.
By querying the records and then using the @variables, we increase the counter, restarting at 1 every time the keyword changes, and then preserve the keyword in the @grpKeyword variable for comparison with the next record. Once ALL bids are processed for the respective keywords, it then queries THAT result but ONLY for those bid on by "mike". These records will have whatever his rank position was.
select RankPerKeyword.*
from
( SELECT ta.*,
@grpCnt := if( @grpKeyword = ta.Keyword, @grpCnt +1, 1 ) as KWRank,
@grpKeyword := ta.Keyword as carryForward
FROM
( select distinct ta1.keyword
from `temp-advertise` ta1
where ta1.name = "mike" ) as JustMike
JOIN `temp-advertise` ta
on JustMike.Keyword = ta.Keyword,
( select @grpCnt := 0,
@grpKeyword := '' ) SqlVars
ORDER BY
ta.Keyword,
ta.Bid DESC ) RankPerKeyword
where
RankPerKeyword.name = "mike"
(Run above to just preview the results... should show 3 records)
So, if you want to know if it was WITHIN the top 3 for a keyword you could just change to
select RankPerKeyword.keyword, MIN( RankPerKeyword.KWRank ) as BestRank
from (rest of query)
group by RankPerKeyword.Keyword
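To make the counter logic easier to follow, here is the same idea sketched in plain Python over the sample rows. The names grp_cnt and grp_keyword are just stand-ins for the two MySQL variables; this is an illustration of the ranking rule, not something to run instead of the query.

```python
# Sample rows from the question: (id, name, keyword, bid)
rows = [
    (1, "mike", "one", 7), (2, "tom", "one", 4), (3, "ced", "one", 6),
    (4, "mike", "two", 1), (5, "tom", "two", 5), (6, "har", "two", 5),
    (7, "mike", "one", 3), (8, "har", "two", 3),
]

# Same ordering as the ORDER BY: keyword first, then highest bid first.
ordered = sorted(rows, key=lambda r: (r[2], -r[3]))

ranked = []
grp_cnt, grp_keyword = 0, ""  # play the roles of the two MySQL variables
for row_id, name, keyword, bid in ordered:
    # Increment within the same keyword, restart at 1 when the keyword changes.
    grp_cnt = grp_cnt + 1 if keyword == grp_keyword else 1
    grp_keyword = keyword
    ranked.append((row_id, name, keyword, bid, grp_cnt))

# Keep only mike's rows, with the rank each one got within its keyword.
mike_ranks = [(r[0], r[2], r[4]) for r in ranked if r[1] == "mike"]
print(mike_ranks)  # [(1, 'one', 1), (7, 'one', 4), (4, 'two', 4)]
```

This matches the reading above: 1st and 4th place for "one", 4th place for "two". Tied bids still get distinct consecutive ranks here, as they do with the counter in the query.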
Try this:
Select e.id, e.name, e.keyword from `temp-advertise` e
where e.name = 'mike'
and 3 <= (select count(*) from `temp-advertise` t
where t.keyword = e.keyword and t.bid > e.bid)
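A quick way to try the "count the higher bids" idea on the sample data is sketched below. SQLite (through Python's sqlite3) is assumed as a stand-in for MySQL; it accepts the same backtick quoting for the hyphenated table name. The filter on name = 'mike' and the aliases e and o are choices made for this example.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE `temp-advertise` "
    "(id INTEGER PRIMARY KEY, name TEXT, keyword TEXT, bid INTEGER)"
)
conn.executemany(
    "INSERT INTO `temp-advertise` VALUES (?, ?, ?, ?)",
    [
        (1, "mike", "one", 7), (2, "tom", "one", 4), (3, "ced", "one", 6),
        (4, "mike", "two", 1), (5, "tom", "two", 5), (6, "har", "two", 5),
        (7, "mike", "one", 3), (8, "har", "two", 3),
    ],
)

# A row is outside the top three when at least three rows with the same
# keyword carry a strictly higher bid.
outside_top3 = [
    r[0]
    for r in conn.execute(
        """
        SELECT e.id
        FROM `temp-advertise` e
        WHERE e.name = 'mike'
          AND 3 <= (SELECT COUNT(*)
                    FROM `temp-advertise` o
                    WHERE o.keyword = e.keyword
                      AND o.bid > e.bid)
        ORDER BY e.id
        """
    )
]
print(outside_top3)  # [4, 7]
```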
Try
SELECT .. ORDER BY bid DESC LIMIT 3,999

Group by - Overriding default behaviour of deciding row under each group in result

Extending further from this question Query to find top rated article in each category -
Consider the same table -
id | category_id | rating
---+-------------+-------
1 | 1 | 10
2 | 1 | 8
3 | 2 | 7
4 | 3 | 5
5 | 3 | 2
6 | 3 | 6
There is a table articles, with fields id, rating (an integer from 1-10), and category_id (an integer representing the category it belongs to). I have the same goal: to get the top rated article in each category (this should be the result):
Desired Result
id | category_id | rating
---+-------------+-------
1 | 1 | 10
3 | 2 | 7
6 | 3 | 6
Extension of original question
But, running the following query -
SELECT id, category_id, max( rating ) AS max_rating
FROM `articles`
GROUP BY category_id
results in the following, where everything except the id field is as desired. I know how to do this with a subquery, as answered in the same question (Using subquery).
id category_id max_rating
1 1 10
3 2 7
4 3 6
In generic terms
Excluding the grouped column (category_id) and the evaluated columns (columns returning results of aggregate function like SUM(), MAX() etc. - in this case max_rating), the values returned in the other fields are simply the first row under every grouped result set (grouped by category_id in this case). E.g. the record with id =1 is the first one in the table under category_id 1 (id 1 and 2 under category_id 1) so it is returned.
I am just wondering is it not possible to somehow overcome this default behavior to return rows based on conditions? If mysql can perform calculation for every grouped result set (does MAX() counting etc) then why can't it return the row corresponding to the maximum rating. Is it not possible to do this in a single query without a subquery? This looks to me like a frequent requirement.
Update
I could not figure out what I want from Naktibalda's solution too. And just to mention again, I know how to do this using a subquery, as again answered by OMG Ponies.
Use:
SELECT x.id,
x.category_id,
x.rating
FROM YOUR_TABLE x
JOIN (SELECT t.category_id,
MAX(t.rating) AS max_rating
FROM YOUR_TABLE t
GROUP BY t.category_id) y ON y.category_id = x.category_id
AND y.max_rating = x.rating
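For anyone who wants to try this, the sketch below runs the query above on the sample articles table (with YOUR_TABLE replaced by articles). It also shows a self LEFT JOIN variant that avoids the derived table; that variant is not from this thread, just a commonly used alternative for the "without a subquery" part of the question. SQLite via Python's sqlite3 is assumed here as a stand-in for MySQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for MySQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, category_id INTEGER, rating INTEGER)"
)
conn.executemany(
    "INSERT INTO articles VALUES (?, ?, ?)",
    [(1, 1, 10), (2, 1, 8), (3, 2, 7), (4, 3, 5), (5, 3, 2), (6, 3, 6)],
)

# Variant 1: join each row to its category's maximum rating (as in the answer).
via_join = conn.execute(
    """
    SELECT x.id, x.category_id, x.rating
    FROM articles x
    JOIN (SELECT t.category_id, MAX(t.rating) AS max_rating
          FROM articles t
          GROUP BY t.category_id) y
      ON y.category_id = x.category_id
     AND y.max_rating = x.rating
    ORDER BY x.category_id
    """
).fetchall()

# Variant 2 (alternative, not from the thread): keep a row only if no other
# row in the same category has a higher rating.
via_anti_join = conn.execute(
    """
    SELECT a.id, a.category_id, a.rating
    FROM articles a
    LEFT JOIN articles b
      ON b.category_id = a.category_id
     AND b.rating > a.rating
    WHERE b.id IS NULL
    ORDER BY a.category_id
    """
).fetchall()

print(via_join)  # [(1, 1, 10), (3, 2, 7), (6, 3, 6)]
```

Both variants return every row that ties for the top rating in a category, so a tie-breaker is still needed if exactly one row per category is required.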