mySQL statement get last record - mysql

I'm trying to get the total number of votes for each FANOF_ID (e.g. me). The problem is that a fan can vote each day for the same FANOF_ID (e.g. David Bowie, RIP).
So each day I could vote for David Bowie as my favorite singer.
ID  CREATED     FAN_ID  FANOF_ID
15  2016-01-24  3       3
16  2016-01-25  3       3
17  2016-01-25  2       3
So from that example I should get a 'total' of 2 fans for FANOF_ID 3.
This is my actual SQL
SELECT
distinct `fans_fanofvote`.`fan_id`,
COUNT(`fans_fanofvote`.`fanof_id`) AS `total`
FROM `fans_fanofvote`
GROUP BY `fans_fanofvote`.`fanof_id`
ORDER BY `total` DESC
But it returns 3 records; even if I use DISTINCT on fan_id, it won't work. How can I get MySQL to do a DISTINCT on FAN_ID?
My SQL should return one record like this:
FANOF_ID TOTAL
3 2

You want COUNT(DISTINCT). However, you have to be careful about what you are counting (fan_id) and what you are aggregating by (fanof_id):
SELECT fov.fanof_id,
COUNT(DISTINCT fov.fan_id) AS total
FROM fans_fanofvote fov
GROUP BY fov.fanof_id
ORDER BY total DESC;
Note that table aliases make the query easier to read. Also, don't use backticks unless they are really needed.
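To see the difference between COUNT and COUNT(DISTINCT) on the question's three rows, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (COUNT(DISTINCT ...) behaves the same way in both); the column types are guesses, since the question does not show the schema.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fans_fanofvote "
    "(id INTEGER, created TEXT, fan_id INTEGER, fanof_id INTEGER)"
)
conn.executemany(
    "INSERT INTO fans_fanofvote VALUES (?, ?, ?, ?)",
    [(15, "2016-01-24", 3, 3), (16, "2016-01-25", 3, 3), (17, "2016-01-25", 2, 3)],
)

# Plain COUNT counts every vote row, including repeat votes by the same fan.
plain_count = conn.execute(
    "SELECT fanof_id, COUNT(fan_id) FROM fans_fanofvote GROUP BY fanof_id"
).fetchall()

# COUNT(DISTINCT fan_id) counts each fan once per fanof_id.
distinct_count = conn.execute(
    "SELECT fanof_id, COUNT(DISTINCT fan_id) AS total "
    "FROM fans_fanofvote GROUP BY fanof_id ORDER BY total DESC"
).fetchall()

print(plain_count)     # [(3, 3)]
print(distinct_count)  # [(3, 2)]
```

The DISTINCT keyword has to go inside the aggregate; SELECT DISTINCT only removes duplicate result rows after grouping has already happened.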

You didn't mention expected output earlier so it was confusing.
SELECT
`fans_fanofvote`.`fanof_id`,
COUNT(DISTINCT `fans_fanofvote`.`fan_id`) AS `total`
FROM `fans_fanofvote`
GROUP BY `fans_fanofvote`.`fanof_id`
ORDER BY `total` DESC

Use count(distinct )
SELECT
COUNT(distinct `fans_fanofvote`.`fan_id`) AS `total`
FROM `fans_fanofvote`
GROUP BY `fans_fanofvote`.`fanof_id`
ORDER BY `tot

Related

get AVG() after GROUP BY in MYSQL

I just started learning MySQL and ran into a problem like this.
So the table is like this:
id name moneySpent
1 Alex 3
2 Alex 1
3 Bill 4
4 Alex 2
5 Alex 1
6 Chris 5
7 Chris 3
Let's say I want to know the average money spent per person. I tried to do that using SUM(), GROUP BY and AVG(), but I got stuck at AVG().
SELECT name, sum(moneySpent) AS total FROM table GROUP BY name;
then this will return
name total
Alex 7
Bill 4
Chris 8
Then how can I get (7+4+8)/3 using AVG()?
You can get average per person using:
SELECT AVG(total) AS AVERAGE
FROM (SELECT name, sum(moneySpent) AS total
FROM table GROUP BY name) A
;
Output:
AVERAGE
6.3333
You can use an inner query to compute the sum per name and an outer query to average those sums, as below.
SELECT Avg(sum1) AS AVERAGE_AMOUNT_SPENT FROM (
SELECT Sum(moneySpent) AS sum1
FROM table1
GROUP BY name
) T1
It will generate the output below.
AVERAGE_AMOUNT_SPENT
------------------
6.3333
which is the output you want, i.e. (7+4+8)/3 = 6.333.
So there are two ways to do this. The first is to use a new (derived) table to hold the SELECT result. It is much easier but may take more space.
The second, suggested by jarlh, made me realize that I do not need to GROUP BY the whole table: I can just add up all moneySpent and divide by the count of distinct names.
Thanks people!
select avg(total) as average from (SELECT name, sum(moneySpent) AS total FROM table GROUP BY name) t;
You can use this query to get your desired output
OUTPUT:
AVERAGE
6.3333333333333333
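Both approaches from this thread (averaging the per-person sums in a derived table, and dividing the grand total by the number of distinct names) can be checked side by side. This is a sketch only: it uses Python's sqlite3 module rather than MySQL, and the table name spending is a hypothetical replacement for the question's table, which is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "spending" is a hypothetical table name chosen for this example.
conn.execute("CREATE TABLE spending (id INTEGER, name TEXT, moneySpent INTEGER)")
conn.executemany(
    "INSERT INTO spending VALUES (?, ?, ?)",
    [(1, "Alex", 3), (2, "Alex", 1), (3, "Bill", 4), (4, "Alex", 2),
     (5, "Alex", 1), (6, "Chris", 5), (7, "Chris", 3)],
)

# Approach 1: average over a derived table of per-person totals.
# The derived table carries an alias, which MySQL requires.
avg_of_totals = conn.execute(
    "SELECT AVG(total) FROM "
    "(SELECT name, SUM(moneySpent) AS total FROM spending GROUP BY name) AS per_person"
).fetchone()[0]

# Approach 2: grand total divided by the number of distinct names.
# Multiplying by 1.0 forces non-integer division in SQLite.
single_pass = conn.execute(
    "SELECT SUM(moneySpent) * 1.0 / COUNT(DISTINCT name) FROM spending"
).fetchone()[0]

print(round(avg_of_totals, 4), round(single_pass, 4))  # 6.3333 6.3333
```

The two forms agree as long as every row has a non-NULL name; the second avoids materializing the intermediate per-person result.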

How to get the number of days between two dates

I have a situation where I need to find the number of days that have passed between two rows with date fields. The rows the calculation is based on are not sorted.
Here is the structure of the table
Folio DATE
1 6/1/2015
2 4/1/2015
1 3/1/2015
4 2/1/2015
1 1/1/2015
Basically, I would need to sort by date and keep only the last two transactions grouped by folio. So in this example, the transaction by folio 1 on 1/1/2015 would be ignored.
Suppose that I need to do the following:
1. Group by folio number
2. Only count the days between the last two transactions by folio. For example, folio #1 would only include the transactions from 6/1/2015 and 3/1/2015.
The result I'm looking for:
Folio FirstDATE LastDate #ofDays
1 3/1/2015 6/1/2015 90
Any MySQL pros out there? My skills are still in newbie territory. Thank you!
UPDATE:
I've managed to come up with the following:
SELECT
SubQuery.`Folio Number`,
SubQuery.LatestClosing,
SubQuery.FirstClosing,
DATEDIFF(SubQuery.LatestClosing, SubQuery.FirstClosing) AS numofdays
FROM (SELECT
SubQuery.`Folio Number`,
SubQuery.LatestClosing,
SubQuery.FirstClosing
FROM (SELECT t.`Folio Number`,
MAX(t.`Closing Date`) AS LatestClosing,
(SELECT
s.`Closing Date`
FROM MLSFinalimport s
WHERE t.`Folio Number` = s.`Folio Number`
ORDER BY s.`Closing Date` DESC
LIMIT 1, 1) AS FirstClosing
FROM MLSFinalimport t
GROUP BY t.`Folio Number`) SubQuery) SubQuery
This is generating a result that looks like this:
LatestClosing First Closing numofdays
7/20/2016 5/9/2006 3725
This is what I need. However, I'm stuck trying to add the original column for each row called "Folio Number". How do I proceed?
Thank you very much.
Pros for MySQL at this? Probably the opposite.. MySQL before 8.0 doesn't support window functions, so you can try using a correlated query with LIMIT/OFFSET:
SELECT p.folio,p.max_d,p.second_d,DATEDIFF(p.max_d,p.second_d) as NumOfDays
FROM (
SELECT t.folio,MAX(t.date) as max_d,
(SELECT s.date FROM YourTable s
WHERE t.folio = s.folio
ORDER BY s.date DESC
LIMIT 1,1) as second_d
FROM YourTable t
GROUP BY t.folio) p
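The correlated-subquery approach can be tried out on the question's sample rows. The sketch below makes several assumptions: it runs on Python's sqlite3 module instead of MySQL, it reads the sample dates as month/day/year and stores them in ISO format, it replaces MySQL's DATEDIFF with a julianday() difference (SQLite has no DATEDIFF), and the table and column names (closings, closing_date) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; dates stored as ISO strings so they sort correctly as text.
conn.execute("CREATE TABLE closings (folio INTEGER, closing_date TEXT)")
conn.executemany(
    "INSERT INTO closings VALUES (?, ?)",
    [(1, "2015-06-01"), (2, "2015-04-01"), (1, "2015-03-01"),
     (4, "2015-02-01"), (1, "2015-01-01")],
)

sql = """
SELECT p.folio, p.second_d, p.max_d,
       CAST(julianday(p.max_d) - julianday(p.second_d) AS INTEGER) AS num_days
FROM (
    SELECT t.folio,
           MAX(t.closing_date) AS max_d,
           -- second-latest date for the same folio: skip 1 row, take 1 row
           (SELECT s.closing_date
            FROM closings s
            WHERE s.folio = t.folio
            ORDER BY s.closing_date DESC
            LIMIT 1 OFFSET 1) AS second_d
    FROM closings t
    GROUP BY t.folio
) p
WHERE p.second_d IS NOT NULL   -- drop folios with only one transaction
ORDER BY p.folio
"""
result = conn.execute(sql).fetchall()
print(result)  # [(1, '2015-03-01', '2015-06-01', 92)]
```

Under the month/day/year reading, March 1 to June 1 is 92 days. Folios with a single transaction get a NULL second date, so the outer WHERE clause filters them out.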

Adding Row Values when there are no results - MySQL

Problem Statement: I need my result set to include records that would not naturally return because they are NULL.
I'm going to put some simplified code here since my code seems to be too long.
Table Scores has Company_type, Company, Score, Project_ID
Select Score, Count(Project_ID)
FROM Scores
WHERE company_type= :company_type
GROUP BY Score
Results in the following:
Score Projects
5 95
4 94
3 215
2 51
1 155
Everything is working fine until I apply a condition to company_type that does not include results in one of the 5 score categories. When this happens, I don't have 5 rows in my result set any more.
It displays like this:
Score Projects
5 5
3 6
1 3
I'd like it to display like this:
Score Projects
5 5
4 0
3 6
2 0
1 3
I need the results to always display 5 rows. (Scores = 1-5)
I tried one of the approaches below by Spencer7593. My simplified query now looks like this:
SELECT i.score AS Score, IFNULL(count(*), 0) AS Projects
FROM (SELECT 5 AS score
UNION ALL
SELECT 4
UNION ALL
SELECT 3
UNION ALL
SELECT 2
UNION ALL
SELECT 1) i
LEFT JOIN Scores ON Scores.score = i.score
GROUP BY Score
ORDER BY i.score DESC
And gives the following results, which is accurate except that the rows with 1 in Projects should actually be 0 because they are derived by the "i". There are no projects with a score of 5 or 2.
Score Projects
5 1
4 5
3 6
2 1
1 3
Solved! I just needed to adjust my count to specifically look at the project count - count(project) rather than count(*). This returned the expected results.
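The difference between COUNT(*) and COUNT(column) after a LEFT JOIN can be reproduced on a small data set. This is a sketch under assumptions: it uses Python's sqlite3 module instead of MySQL, the sample rows are invented, and the company_type filter is placed in the ON clause so that unmatched scores are kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (company_type TEXT, score INTEGER, project_id INTEGER)")
# Invented sample data: type 'A' has no projects with score 4 or 2.
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?)",
    [("A", 5, 1), ("A", 5, 2), ("A", 3, 3), ("A", 1, 4), ("B", 4, 5)],
)

sql = """
SELECT i.score,
       COUNT(*)            AS count_star,  -- counts the NULL-extended row too
       COUNT(s.project_id) AS projects     -- ignores NULLs from unmatched rows
FROM (SELECT 5 AS score UNION ALL SELECT 4 UNION ALL SELECT 3
      UNION ALL SELECT 2 UNION ALL SELECT 1) i
LEFT JOIN Scores s
       ON s.score = i.score AND s.company_type = ?
GROUP BY i.score
ORDER BY i.score DESC
"""
rows = conn.execute(sql, ("A",)).fetchall()
for row in rows:
    print(row)
# (5, 2, 2)
# (4, 1, 0)
# (3, 1, 1)
# (2, 1, 0)
# (1, 1, 1)
```

An unmatched score still produces one joined row with NULLs on the Scores side, which COUNT(*) counts as 1. Putting the company_type condition in a WHERE clause instead of the ON clause would discard those NULL-extended rows and bring back the missing-score problem.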
If you always want your query to return 5 rows, with Score values of 5,4,3,2,1... you'll need a rowsource that supplies those Score values.
One approach would be to use a simple query to return those fixed values, e.g.
SELECT 5 AS score
UNION ALL SELECT 4
UNION ALL SELECT 3
UNION ALL SELECT 2
UNION ALL SELECT 1
Then use that query as an inline view, and do an outer join operation to the results from your current query:
SELECT i.score AS `Score`
, IFNULL(q.projects,0) AS `Projects`
FROM ( SELECT 5 AS score
UNION ALL SELECT 4
UNION ALL SELECT 3
UNION ALL SELECT 2
UNION ALL SELECT 1
) i
LEFT
JOIN (
-- the current query with "missing" Score rows goes here
-- for completeness of this example, without the query
-- we emulate that result with a different query
SELECT 5 AS score, 95 AS projects
UNION ALL SELECT 3, 215
UNION ALL SELECT 1, 155
) q
ON q.score = i.score
ORDER BY i.score DESC
It doesn't have to be the view query in this example. But there does need to be a rowsource that the rows can be returned from. You could, for example, have a simple table that contains those five rows, with those five score values.
This is just one example of the general approach. It might be possible to modify your existing query to return the rows you want. But without seeing the query, the schema, and example data, we can't tell.
FOLLOWUP
Based on the edit to the question, showing an example of the current query.
If we are guaranteed that the five values of Score will always appear in the Scores table, we could do conditional aggregation, writing a query like this:
SELECT s.score
, COUNT(IF(s.company_type = :company_type,s.project_id,NULL)) AS projects
FROM Scores s
GROUP BY s.score
ORDER BY s.score DESC
Note that this will require a scan of all the rows, so it may not perform as well. (The "trick" is the IF function, which returns a NULL value in place of project_id when the row would have been excluded by the WHERE clause.)
If we are guaranteed that project_id is non-NULL, we could use a more terse MySQL shorthand expression to achieve an equivalent result...
, IFNULL(SUM(s.company_type = :company_type),0) AS projects
This works because MySQL returns 1 when the comparison is TRUE, and otherwise returns 0 or NULL.
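Both conditional-aggregation forms can be compared on a small table in which every score value appears at least once. The sketch below is illustrative only: it runs on Python's sqlite3 module, the rows are invented, and since MySQL's IF() is not available in every SQLite version, a CASE expression stands in for it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Scores (company_type TEXT, score INTEGER, project_id INTEGER)")
# Invented rows; all five score values are present, as the approach requires.
conn.executemany(
    "INSERT INTO Scores VALUES (?, ?, ?)",
    [("A", 5, 1), ("A", 5, 2), ("B", 4, 3), ("A", 3, 4), ("B", 2, 5), ("A", 1, 6)],
)

sql = """
SELECT s.score,
       -- CASE yields NULL for non-matching rows, and COUNT skips NULLs
       COUNT(CASE WHEN s.company_type = :company_type THEN s.project_id END) AS by_count,
       -- the comparison evaluates to 1 or 0, so SUM adds up the matches
       SUM(s.company_type = :company_type) AS by_sum
FROM Scores s
GROUP BY s.score
ORDER BY s.score DESC
"""
rows = conn.execute(sql, {"company_type": "A"}).fetchall()
print(rows)  # [(5, 2, 2), (4, 0, 0), (3, 1, 1), (2, 0, 0), (1, 1, 1)]
```

Because there is no WHERE clause, every score group survives and the filter only decides which rows contribute to the count.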
Try something like this:
select s.score, ifnull(x.cnt, 0) as cnt
from (
select distinct score from scores
) s
left outer join (
Select Score, Count(Project_ID) cnt
FROM Scores
WHERE company_type= :company_type
GROUP BY Score
) x
on s.score = x.score
Your posted query would not work without a group by statement. However, even there, if you don't have those particular scores for that company type, it wouldn't work either.
One option is to use an outer join. That would require a little more work though.
Here's another option using conditional aggregation:
select Score, sum(company_type=:company_type)
from Scores
group by Score

How can I write a query that aggregate a single row with latest date among multiple set of rows?

I have a MySQL table where there are many rows for each person, and I want to write a query which aggregates rows with a special constraint (one row per person).
For example, let's say the table consists of the following data.
name date reason
---------------------------------------
John 2013-04-01 14:00:00 Vacation
John 2013-03-31 18:00:00 Sick
Ted 2012-05-06 20:00:00 Sick
Ted 2012-02-20 01:00:00 Vacation
John 2011-12-21 00:00:00 Sick
Bob 2011-04-02 20:00:00 Sick
I want to see the distribution of 'reason' column. If I just write a query like below
select reason, count(*) as count from table group by reason
then I will be able to see number of reasons for this table overall.
reason count
------------------
Sick 4
Vacation 2
However, I am only interested in single reason from each person. The reason that should be counted should be from a row with latest date from the person's records. For example, John's latest reason would be Vacation while Ted's latest reason would be Sick. And Bob's latest reason (and the only reason) is Sick.
The expected result for that query should be like below. (Sum of count will be 3 because there are only 3 people)
reason count
-----------------
Sick 2
Vacation 1
Is it possible to write a query such that single latest reason will be counted when I want to see distribution(count) of reasons?
Here are some facts about the table.
The table has tens of millions of rows
Most of the time, each person has one reason.
Some people have multiple reasons, but 99.99% of people have fewer than 5 reasons.
There are about 30 different reasons while there are millions of distinct names.
The table is partitioned based on date range.
SELECT T.REASON, COUNT(*)
FROM
(
SELECT PERSON, MAX(DATE) AS MAX_DATE
FROM TABLE-NAME
GROUP BY PERSON
) A, TABLE-NAME T
WHERE T.PERSON = A.PERSON AND T.DATE = A.MAX_DATE
GROUP BY T.REASON
Try this
select reason, count(*) from
(select reason from table where (name, date) in
(select name, max(date) from table group by name)) t
group by reason
In MySQL versions before 8.0, this kind of query is not very efficient, since you don't have access to tools like the partitioned ranking functions available in SQL Server or Oracle.
You can still emulate it with a subquery, retrieving the rows based on the condition you need, here the maximum date:
SELECT t.reason, COUNT(1)
FROM
(
SELECT name, MAX(adate) AS maxDate
FROM aTable
GROUP BY name
) maxDateRows
INNER JOIN aTable t ON maxDateRows.name = t.name
AND maxDateRows.maxDate = t.adate
GROUP BY t.reason
Test this query on your samples, but I'm afraid that it will be slow as hell.
For your information, you can do the same thing in a more elegant and much much faster way in SQL Server :
SELECT reason, COUNT(1)
FROM
(
SELECT name
, reason
, RANK() OVER(PARTITION BY name ORDER BY adate DESC) as Rank
FROM #aTable
) AS rankTable
WHERE Rank = 1
GROUP BY reason
If you are really stuck with MySQL, and the first query is too slow, then you can split the problem.
Do a first query creating a table:
CREATE TABLE maxDateRows AS
SELECT name, MAX(adate) AS maxDate
FROM aTable
GROUP BY name
Then create an index on both name and maxDate.
Finally, get the results:
SELECT t.reason, COUNT(1)
FROM maxDateRows m
INNER JOIN aTable t ON m.name = t.name
AND m.maxDate = t.adate
GROUP BY t.reason
The solution you are looking for seems to be solved by this query :
select
reason,
count(*)
from (select * from tablename group by name) abc
group by
reason
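It is quite fast and simple, but it only runs when ONLY_FULL_GROUP_BY is disabled, and even then MySQL is free to return any row per name, so the reason it picks is not guaranteed to be the latest one.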
Apologies if this answer duplicates an existing one. Maybe I'm suffering from some form of aphasia, but I cannot see it...
SELECT x.reason
, COUNT(*)
FROM absentism x
JOIN
( SELECT name,MAX(date) max_date FROM absentism GROUP BY name) y
ON y.name = x.name
AND y.max_date = x.date
GROUP
BY reason;
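The join-on-maximum-date pattern used by several answers above can be checked against the question's sample data. This sketch assumes Python's sqlite3 module in place of MySQL; the table name absences and the column name adate are hypothetical choices for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names; timestamps kept as ISO strings.
conn.execute("CREATE TABLE absences (name TEXT, adate TEXT, reason TEXT)")
conn.executemany(
    "INSERT INTO absences VALUES (?, ?, ?)",
    [("John", "2013-04-01 14:00:00", "Vacation"),
     ("John", "2013-03-31 18:00:00", "Sick"),
     ("Ted",  "2012-05-06 20:00:00", "Sick"),
     ("Ted",  "2012-02-20 01:00:00", "Vacation"),
     ("John", "2011-12-21 00:00:00", "Sick"),
     ("Bob",  "2011-04-02 20:00:00", "Sick")],
)

sql = """
SELECT t.reason, COUNT(*) AS cnt
FROM absences t
JOIN (SELECT name, MAX(adate) AS max_date   -- latest row per person
      FROM absences
      GROUP BY name) m
  ON m.name = t.name AND m.max_date = t.adate
GROUP BY t.reason
ORDER BY t.reason
"""
distribution = conn.execute(sql).fetchall()
print(distribution)  # [('Sick', 2), ('Vacation', 1)]
```

Joining on both name and the maximum date matters: matching on the date alone could pick up another person's row that happens to share the same timestamp. If one person has two rows with the same latest timestamp, both are counted.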

MySQL check if MAX value has duplicates

I'm running contests on my website. Every contest can have multiple entries. I only want to find out whether the MAX value of votes has a duplicate.
The table is as follows:
contest_id entry_id votes
1 1 50
1 2 34
1 3 50
2 4 20
2 5 55
3 6 53
I just need the query to show me that contest 1 has a duplicate MAX value without additional information.
I tried this but it didn't work:
SELECT MAX(votes) from contest group by contest_id having count(votes) > 1
SELECT a.contest_ID
FROM contest a
INNER JOIN
(
SELECT contest_id, MAX(votes) totalVotes
FROM contest
GROUP BY contest_id
) b ON a.contest_ID = b.contest_ID AND
a.votes = b.totalvotes
GROUP BY a.contest_ID
HAVING COUNT(*) >= 2
This finds the max votes value per contest and counts the entries with that number of votes.
It then displays the contests with more than one hit.
SELECT contest_id
FROM contests
WHERE votes=(
SELECT MAX(votes) FROM contests c WHERE c.contest_id=contests.contest_id
)
GROUP BY contest_id
HAVING COUNT(*) > 1;
You could do it by first selecting the maximum number of votes for each contest ID in a subquery, and then joining against the results (demo on SQLFiddle):
SELECT contest_id, votes
FROM contest
JOIN (
SELECT contest_id, MAX(votes) AS votes
FROM contest GROUP BY contest_id
) AS foo USING (contest_id, votes)
GROUP BY contest_id
HAVING COUNT(*) > 1
The nice thing about doing it like this is that it's an independent subquery, so MySQL only needs to run it once.
Ps. Yes, this is basically identical to JW's answer, but I figured I'd leave it up anyway to show the slightly different syntax I used for the join.
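To see why the question's HAVING attempt falls short and how the join on the per-contest maximum fixes it, here is a small sketch on the sample data. It runs on Python's sqlite3 module as an assumed stand-in for MySQL, with ORDER BY added only to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contest (contest_id INTEGER, entry_id INTEGER, votes INTEGER)")
conn.executemany(
    "INSERT INTO contest VALUES (?, ?, ?)",
    [(1, 1, 50), (1, 2, 34), (1, 3, 50), (2, 4, 20), (2, 5, 55), (3, 6, 53)],
)

# The question's attempt: HAVING COUNT(votes) > 1 only checks that a contest
# has more than one entry, not that the maximum is shared.
naive = conn.execute(
    "SELECT contest_id, MAX(votes) FROM contest "
    "GROUP BY contest_id HAVING COUNT(votes) > 1 ORDER BY contest_id"
).fetchall()

# Join each entry to its contest's maximum, then count the entries at that maximum.
tied = conn.execute(
    """
    SELECT a.contest_id
    FROM contest a
    JOIN (SELECT contest_id, MAX(votes) AS max_votes
          FROM contest
          GROUP BY contest_id) b
      ON a.contest_id = b.contest_id AND a.votes = b.max_votes
    GROUP BY a.contest_id
    HAVING COUNT(*) > 1
    ORDER BY a.contest_id
    """
).fetchall()

print(naive)  # [(1, 50), (2, 55)]
print(tied)   # [(1,)]
```

Contest 2 shows up in the naive result because it has two entries, even though its maximum of 55 is unique; only contest 1 has two entries tied at the maximum.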