How to use count after using group by and count in sql? - mysql

I am trying to see statistics on how many times passengers passed through my application. I run this query:
select count(person_id), person_id from Passenger group by person_id;
count(person_id) person_id
6 123
2 421
1 542
3 612
1 643
2 876
I see that passenger "123" passed 6 times, "421" passed 2 times, "542" passed 1 time, and so on. I want to analyze this further and report:
> 1 passenger passed 6 times,
> 2 passengers passed 2 times,
> 2 passengers passed 1 time,
> 1 passenger passed 3 times.
Here is sqlFiddle for your better understanding..

You can use a SELECT with a subquery to obtain the result you want:
-- "table" is a reserved word in MySQL, so the derived table is aliased as t
SELECT CONCAT(COUNT(*), ' passenger passed ', t.theCount, ' times,') FROM
(
SELECT COUNT(person_id) AS theCount, person_id
FROM Passenger
GROUP BY person_id
) t
GROUP BY t.theCount

select cnt, count(person_id)
from
(
select count(person_id) as cnt, person_id
from Passenger
group by person_id
) tmp
group by cnt

Is this what you are looking for?
select count(person_id) as "Num passengers", times
from (
select count(person_id) as times, person_id
from Passenger
group by person_id
) sub
group by times order by times ASC
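The "count of counts" idea in the answers above can be tried out quickly. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained); the function name count_of_counts and the alias per_person are hypothetical, and the data is rebuilt from the counts shown in the question.

```python
import sqlite3

def count_of_counts(person_ids):
    """Return (times, passengers) pairs: how many passengers passed `times` times."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Passenger (person_id INTEGER)")
    conn.executemany("INSERT INTO Passenger VALUES (?)", [(p,) for p in person_ids])
    rows = conn.execute(
        """
        SELECT times, COUNT(*) AS passengers
        FROM (
            -- inner level: one row per passenger with their number of passes
            SELECT person_id, COUNT(*) AS times
            FROM Passenger
            GROUP BY person_id
        ) AS per_person
        -- outer level: group the per-passenger counts themselves
        GROUP BY times
        ORDER BY times
        """
    ).fetchall()
    conn.close()
    return rows

# Rebuild the question's data: passenger 123 passed 6 times, 421 twice, and so on.
sample = [123] * 6 + [421] * 2 + [542] + [612] * 3 + [643] + [876] * 2

for times, passengers in count_of_counts(sample):
    print(f"{passengers} passenger(s) passed {times} time(s)")
```

The inner query is the one from the question; the outer query only adds a second GROUP BY over its result, which is why every answer above needs a derived table with an alias.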

Related

Sum unequal and removing duplicates from SQL query results

My base query:
SELECT project_id,
name,
stories_produced,
on_date
FROM project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
ORDER BY project_id
It can get me these outputs:
Output example:
id name stories_produced on_date
1042 project 1 1001 (wanted) 2017-03-01
1042 project 1 1801 (wanted) 2017-06-10
1568 project 2 355 (wanted) 2017-06-10
1405 project 3 1 (not wanted) 2017-03-10
1405 project 3 1 (not wanted) 2017-06-10
Note: there is a constraint on (id, on_date), meaning there can only ever be one record of a project's production on a specific date.
The cases I want to keep:
Duplicate records that have the same id, exist on both dates, and have different production values (wanted)
Single records that exist on only one of the dates (wanted)
The problem:
Duplicate records that have the same id, exist on both dates, and have equal production values (not wanted)
My current query, which needs to change:
select project_id,
name,
CASE
WHEN max(stories_produced) - min(stories_produced) = 0
THEN max(stories_produced)
ELSE max(stories_produced) - min(stories_produced)
END AS 'stories_produced'
from project_prod
WHERE on_date IN ('2017-03-01', '2017-06-10')
group by project_id;
output example:
id name stories_produced
1042 project 1 800 (wanted)
1568 project 2 355 (wanted)
1405 project 3 1 (not wanted)
The CASE is currently not taking care of the third constraint (Duplicate records, that have the same id, and exist in both dates and have EQUAL production values (not wanted))
Is there any possible condition that can accommodate this?
One option uses not exists to drop rows that have the same id, and exist in both dates and have equal production values:
select
p.project_id,
p.name,
p.stories_produced,
p.on_date
from project_prod p
where
p.on_date in ('2017-03-01', '2017-06-10')
and not exists (
-- a row for the same project on the other date with the same value
select 1
from project_prod p1
where
p1.on_date in ('2017-03-01', '2017-06-10')
and p1.on_date <> p.on_date
and p1.project_id = p.project_id
and p1.stories_produced = p.stories_produced
)
order by p.project_id
In MySQL 8.0, you can use window functions:
select
project_id,
name,
stories_produced,
on_date
from (
select
p.*,
min(stories_produced) over(partition by project_id) min_stories_produced,
max(stories_produced) over(partition by project_id) max_stories_produced,
count(*) over(partition by project_id) cnt
from project_prod p
where on_date in ('2017-03-01', '2017-06-10')
) t
where not (cnt = 2 and min_stories_produced = max_stories_produced)
order by project_id
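To check the NOT EXISTS filter against the question's sample rows, here is a small sketch using Python's built-in sqlite3 module in place of MySQL (an assumption for self-containment). The function name changed_or_single is hypothetical, and the example assumes both "project 3" rows fall on the two filtered dates (2017-03-01 and 2017-06-10), which is what the question's description implies.

```python
import sqlite3

def changed_or_single(rows, dates):
    """Keep rows on the two dates unless the same project has an equal value on the other date."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE project_prod (
               project_id INTEGER, name TEXT, stories_produced INTEGER, on_date TEXT,
               UNIQUE (project_id, on_date))"""
    )
    conn.executemany("INSERT INTO project_prod VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT p.project_id, p.stories_produced, p.on_date
        FROM project_prod p
        WHERE p.on_date IN (?, ?)
          AND NOT EXISTS (
              SELECT 1
              FROM project_prod p1
              WHERE p1.on_date IN (?, ?)
                AND p1.on_date <> p.on_date
                AND p1.project_id = p.project_id
                AND p1.stories_produced = p.stories_produced
          )
        ORDER BY p.project_id, p.on_date
        """,
        tuple(dates) * 2,  # the same two dates feed both IN lists
    ).fetchall()
    conn.close()
    return result

sample_rows = [
    (1042, "project 1", 1001, "2017-03-01"),
    (1042, "project 1", 1801, "2017-06-10"),
    (1568, "project 2", 355, "2017-06-10"),
    (1405, "project 3", 1, "2017-03-01"),  # assumed date, see note above
    (1405, "project 3", 1, "2017-06-10"),
]

for row in changed_or_single(sample_rows, ("2017-03-01", "2017-06-10")):
    print(row)
```

Both rows of project 1405 drop out because each one finds its twin on the other date, while the single row of project 1568 has nothing to match and survives.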

configure query to bring rows which have more than 1 entries

How to get those entries which have more than 1 records?
If it doesn't make sense... let me explain:
From the below table I want to access the sum of the commission of all rows where type is joining and "they have more than 1 entry with same downmem_id".
I have this query but it doesn't consider more entries scenario...
$search = "SELECT sum(commission) as income FROM `$database`.`$memcom` where type='joining'";
Here's the table:
id mem_id commission downmem_id type time
2 1 3250 2 joining 2019-09-22 13:24:40
3 45 500 2 egbvegr new time
4 32 20 2 vnsjkdv other time
5 23 2222 2 vfdvfvf some other time
6 43 42 3 joining time
7 32 353 5 joining time
8 54 35 5 vsdvsdd time
Here's the expected result: it should be the sum of ids 2 and 7 only,
i.e. 3250 + 353 = 3603.
It shouldn't include id no 6 because it has only 1 row with the same downmem_id.
Please help me to make this query.
Another approach is two levels of aggregation:
select sum(t.commission) income
from (select sum(case when type = 'joining' then commission end) as commission
from t
group by downmem_id
having count(*) > 1
) t;
The main advantage to this approach is that this more readily supports more complex conditions on the other members of each group -- such as at most one "joining" record or both "joining" records and no more than two "vnsjkdv" records.
Use EXISTS:
select sum(t.commission) income
from tablename t
where t.type = 'joining'
and exists (
select 1 from tablename
where id <> t.id and downmem_id = t.downmem_id
)
Results:
| income |
| ----- |
| 3603 |
You can use subquery that will find all downmem_id having more than one occurrence in the table.
SELECT Sum(commission) AS income
FROM tablename
WHERE type = 'joining'
AND downmem_id IN (SELECT downmem_id
FROM tablename t
GROUP BY downmem_id
HAVING Count(id) > 1);
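The three approaches above (two levels of aggregation, EXISTS, and IN with a grouped subquery) can be compared side by side on the question's table. This is only a sketch: it runs on Python's built-in sqlite3 module rather than MySQL, uses the placeholder table name tablename from the answers, leaves out the time column because it plays no part in the result, and the names ROWS, QUERIES and run_all are hypothetical.

```python
import sqlite3

# (id, mem_id, commission, downmem_id, type) copied from the question's table
ROWS = [
    (2, 1, 3250, 2, "joining"),
    (3, 45, 500, 2, "egbvegr"),
    (4, 32, 20, 2, "vnsjkdv"),
    (5, 23, 2222, 2, "vfdvfvf"),
    (6, 43, 42, 3, "joining"),
    (7, 32, 353, 5, "joining"),
    (8, 54, 35, 5, "vsdvsdd"),
]

QUERIES = {
    "two_level": """
        SELECT SUM(g.commission)
        FROM (SELECT SUM(CASE WHEN type = 'joining' THEN commission END) AS commission
              FROM tablename
              GROUP BY downmem_id
              HAVING COUNT(*) > 1) g
    """,
    "exists": """
        SELECT SUM(t.commission)
        FROM tablename t
        WHERE t.type = 'joining'
          AND EXISTS (SELECT 1 FROM tablename
                      WHERE id <> t.id AND downmem_id = t.downmem_id)
    """,
    "in_subquery": """
        SELECT SUM(commission)
        FROM tablename
        WHERE type = 'joining'
          AND downmem_id IN (SELECT downmem_id FROM tablename
                             GROUP BY downmem_id HAVING COUNT(id) > 1)
    """,
}

def run_all():
    """Run every query variant on the sample rows and return {name: income}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tablename (id INTEGER PRIMARY KEY, mem_id INTEGER, "
        "commission INTEGER, downmem_id INTEGER, type TEXT)"
    )
    conn.executemany("INSERT INTO tablename VALUES (?, ?, ?, ?, ?)", ROWS)
    results = {name: conn.execute(sql).fetchone()[0] for name, sql in QUERIES.items()}
    conn.close()
    return results

print(run_all())
```

On this data all three variants agree on 3603 (ids 2 and 7); id 6 is excluded because downmem_id 3 occurs only once.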

Trouble using group by to get a max value across two tables

I have been trying to solve a problem for days and I am not making any progress. Basically, I have two tables, players and matches. Each player in players has a unique player_id, as well as a group_id that identifies which group he/she belongs to. Each match in matches has the player_ids of two players in it, first_player and second_player, who are always from the same group. first_score is the score of first_player and second_score is the score of second_player. A match is won by whoever scores more. Here are the two tables:
create table players (
player_id integer not null unique,
group_id integer not null
);
create table matches (
match_id integer not null unique,
first_player integer not null,
second_player integer not null,
first_score integer not null,
second_score integer not null
);
Now what I am trying to do is to get the players with the most wins from each group, their group ID as well as the number of wins. So, for example, if there are three groups, the result would be something like:
Group Player Wins
1 24 23
2 13 25
3 34 20
Here's what I have right now
SELECT p1.group_id AS Group, p1.player_id AS Player, COUNT(*) AS Wins
FROM players p1, matches m1
WHERE (m1.first_player = p1.player_id AND m1.first_score > m1.second_score)
OR (m1.second_player = p1.player_id AND m1.second_score > m1.first_score)
GROUP BY p1.group_id
HAVING COUNT(*) >= (
SELECT COUNT(*)
FROM players p2, matches m2
WHERE p2.group_id = p1.group_id AND
((m2.first_player = p2.player_id AND m2.first_score > m2.second_score)
OR (m2.second_player = p2.player_id AND m2.second_score > m2.first_score))
)
My idea is to only select players whose wins are greater than, or equal to, the wins of all other players in his group. There is some syntactic problem with my query. I think I am using GROUP BY incorrectly as well.
There is also the issue of a tie in the number of wins, where I should just get the player with the least player_id. But I haven't even gotten to that point yet. I would really appreciate your help, thanks!
EDIT 1
I have a few sample data that I am running my query against.
SELECT * FROM players gives me this:
Player_ID Group_ID
100 1
200 1
300 1
400 2
500 2
600 3
700 3
SELECT * FROM matches gives me this:
match_id first_player second_player first_score second_score
1 100 200 10 20
2 200 300 30 20
3 400 500 30 10
4 500 400 20 20
5 600 700 20 10
So, the query should return:
Group Player Wins
1 200 2
2 400 1
3 600 1
Running the query as is returns the following error:
ERROR: column "p1.player_id" must appear in the GROUP BY clause or be used in an aggregate function
Now I understand that I have to specify player_id in the GROUP BY clause if I want to use it in the SELECT (or HAVING) statement, but I do not wish to group by player ID, only by the group ID.
Even if I do add p1.player_id to GROUP BY in my outer query, I get...the correct answer actually. But I am a bit confused. Doesn't Group By aggregate the table according to that column? Logically speaking, I only want to group by p1.group_id.
Also, if I were to have multiple players in a group with the highest number of wins, how can I just keep the one with the lowest player_id?
Edit 2
If I change the matches table to such that for Group 1, there are two players with 1 win each, the query result omits Group 1 from the result altogether.
So, if my matches table is:
match_id first_player second_player first_score second_score
1 100 200 10 20
2 200 300 10* 20
3 400 500 30 10
4 500 400 20 20
5 600 700 20 10
I would expect the result to be
Group Player Wins
1 200 1
1 300 1
2 400 1
3 600 1
However, I get the following:
Group Player Wins
2 400 1
3 600 1
Note that the desired result is
Group Player Wins
1 200 1
2 400 1
3 600 1
Since I wish to only take the player with the least player_id in the case of a draw.
WITH first_players AS (
SELECT group_id, player_id, SUM(first_score) AS scores
FROM players p LEFT JOIN matches m ON p.player_id = m.first_player
GROUP BY group_id, player_id
),
second_players AS (
SELECT group_id, player_id, SUM(second_score) AS scores
FROM players p LEFT JOIN matches m ON p.player_id = m.second_player
GROUP BY group_id, player_id
),
all_players AS (
WITH al AS (
SELECT group_id, player_id, scores FROM first_players
UNION ALL
SELECT group_id, player_id, scores FROM second_players
)
SELECT group_id, player_id, COALESCE(SUM(scores), 0) AS scores
FROM al
GROUP BY group_id, player_id
),
players_rank AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY scores DESC, player_id ASC) AS score_rank,
ROW_NUMBER() OVER (PARTITION BY scores ORDER BY player_id ASC) AS id_rank
FROM all_players
ORDER BY group_id
)
SELECT group_id, player_id AS winner_id FROM players_rank WHERE score_rank = 1 AND id_rank = 1
Results
group_id winner_id
1 45
2 20
3 40
Try something like this:
with cte as
(
select p.Group_ID, t1.winplayer, t1.numberofwin,
row_number() over(partition by p.Group_ID order by t1.numberofwin desc, t1.winplayer) rn
from players p join
(
SELECT count(*) as numberofwin,
case when first_score > second_score then first_player
else second_player end as winplayer
FROM matches
where first_score <> second_score -- a tie is not a win for either player
group by case when first_score > second_score then first_player
else second_player end
) t1 on p.Player_ID = t1.winplayer
) select * from cte where rn = 1
It works when you add the player_id in the GROUP BY because you know each player plays only in one group. So you group by the player in a certain group. That is why, logically, you can add the player_id to the GROUP BY.
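To see the tie-breaking behave as the question asks (most wins per group, lowest player_id on a tie), the sketch below runs a ROW_NUMBER-based query on both sample data sets. It uses Python's built-in sqlite3 module instead of PostgreSQL, which is an assumption, and it needs an SQLite build with window functions (3.25 or later). The function name top_winner_per_group and the CTE names wins and ranked are hypothetical.

```python
import sqlite3

def top_winner_per_group(players, matches):
    """Return (group_id, player_id, wins) for the top winner of each group."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE players (player_id INTEGER NOT NULL UNIQUE, group_id INTEGER NOT NULL);
        CREATE TABLE matches (match_id INTEGER NOT NULL UNIQUE,
                              first_player INTEGER NOT NULL, second_player INTEGER NOT NULL,
                              first_score INTEGER NOT NULL, second_score INTEGER NOT NULL);
        """
    )
    conn.executemany("INSERT INTO players VALUES (?, ?)", players)
    conn.executemany("INSERT INTO matches VALUES (?, ?, ?, ?, ?)", matches)
    rows = conn.execute(
        """
        WITH wins AS (
            SELECT winner, COUNT(*) AS wins
            FROM (SELECT CASE WHEN first_score > second_score
                              THEN first_player ELSE second_player END AS winner
                  FROM matches
                  WHERE first_score <> second_score)  -- drawn matches have no winner
            GROUP BY winner
        ),
        ranked AS (
            SELECT p.group_id, p.player_id, COALESCE(w.wins, 0) AS wins,
                   ROW_NUMBER() OVER (PARTITION BY p.group_id
                                      ORDER BY COALESCE(w.wins, 0) DESC, p.player_id) AS rn
            FROM players p
            LEFT JOIN wins w ON w.winner = p.player_id
        )
        SELECT group_id, player_id, wins FROM ranked WHERE rn = 1 ORDER BY group_id
        """
    ).fetchall()
    conn.close()
    return rows

players = [(100, 1), (200, 1), (300, 1), (400, 2), (500, 2), (600, 3), (700, 3)]
matches_edit1 = [(1, 100, 200, 10, 20), (2, 200, 300, 30, 20), (3, 400, 500, 30, 10),
                 (4, 500, 400, 20, 20), (5, 600, 700, 20, 10)]
# Edit 2 changes match 2 so that players 200 and 300 each have one win.
matches_edit2 = [(1, 100, 200, 10, 20), (2, 200, 300, 10, 20), (3, 400, 500, 30, 10),
                 (4, 500, 400, 20, 20), (5, 600, 700, 20, 10)]

print(top_winner_per_group(players, matches_edit1))
print(top_winner_per_group(players, matches_edit2))
```

Ranking inside each group with ROW_NUMBER avoids the GROUP BY question entirely: wins are counted per player first, and the group only appears in the PARTITION BY clause. The LEFT JOIN is a design choice so that a group whose players have no wins at all still returns its lowest player_id with 0 wins.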

Getting the number of lines having a specific ID without group by in SQL

I have a 'billing' table which holds all billings from my subscribers. A subscriber can have multiple billings.
I have a simple SQL query:
SELECT count(billing_id),subscriber_id
FROM billing
group by subscriber_id
As a result I get a list of all my subscribers with the number of billings each has made.
I want a list of all the billings, not grouped by subscriber, but with the result of the previous query appearing on each line.
Example:
Result of my previous request:
sub_id nb_billings
1 3
2 2
What I want :
sub_id nb_billings
1 3
1 3
1 3
2 2
2 2
Thanks
I'd do it like this;
SELECT
b.subscriber_id
,a.billing_count
FROM billing b
JOIN (SELECT subscriber_id, count(billing_id) billing_count FROM billing GROUP BY subscriber_id) a
ON b.subscriber_id = a.subscriber_id
The subquery works out the count of billing_id by subscriber, this is then joined to all rows of your original table (using subscriber_id). This should give the result you're after.
You can use a subquery to do that:
SELECT
(SELECT count(t2.billing_id) FROM billing t2 WHERE t2.subscriber_id = t1.subscriber_id),
t1.subscriber_id
FROM billing t1
I guess this should suffice :
SELECT s.subscriber_id,
s.billing_id,
s.TotalCount
FROM (
SELECT subscriber_id,
billing_id,
COUNT(billing_id) AS TotalCount
FROM BILLING
GROUP BY subscriber_id,
billing_id
) s
GROUP BY s.subscriber_id,
s.TotalCount,
s.billing_id
ORDER BY s.subscriber_id
This should give you the result as follows :
subscriber_id billing_id TotalCount
1 10a 2
1 10b 2
1 10c 1
2 10a 1
2 10b 1
2 10c 3
2 10d 1
You can see this here -> http://rextester.com/AVVS23801
Hope this helps!!
select subscriber_id,count(billing_id)over(partition by subscriber_id)
from billing
will do just that.
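The window-function answer and the join-to-a-grouped-subquery answer should return the same rows. Here is a small sketch that checks this on data shaped like the question's example (subscriber 1 with three billings, subscriber 2 with two). It uses Python's built-in sqlite3 module as a stand-in for the asker's database, which is an assumption, and needs SQLite 3.25 or later for COUNT(...) OVER; the function name billings_with_counts and the billing_id values are hypothetical.

```python
import sqlite3

def billings_with_counts(billings):
    """Return (window_rows, join_rows): one row per billing, each with its subscriber's total."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE billing (billing_id INTEGER PRIMARY KEY, subscriber_id INTEGER)")
    conn.executemany("INSERT INTO billing VALUES (?, ?)", billings)

    # Window function: the count is computed per partition but no rows are collapsed.
    window_rows = conn.execute(
        """
        SELECT subscriber_id,
               COUNT(billing_id) OVER (PARTITION BY subscriber_id) AS nb_billings
        FROM billing
        ORDER BY subscriber_id
        """
    ).fetchall()

    # Join: aggregate once per subscriber, then attach the total to every billing row.
    join_rows = conn.execute(
        """
        SELECT b.subscriber_id, a.billing_count
        FROM billing b
        JOIN (SELECT subscriber_id, COUNT(billing_id) AS billing_count
              FROM billing
              GROUP BY subscriber_id) a
          ON b.subscriber_id = a.subscriber_id
        ORDER BY b.subscriber_id
        """
    ).fetchall()
    conn.close()
    return window_rows, join_rows

sample_billings = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 2)]
window_rows, join_rows = billings_with_counts(sample_billings)
print(window_rows)
print(join_rows)
```

The join form works on databases without window functions; the window form is shorter where they are available.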

MySql count the total number of the row with a value that occurs at least two times

I want to select the total number of distinct class_id values in which at least two students share the same birthday.
class_id student_id birthday
1 30 1994-10-01
1 23 1994-01-01
1 19 1994-02-01
1 11 1994-03-01
2 9 1994-02-01
2 43 1994-03-01
3 41 1994-06-01
3 21 1994-05-01
4 9 1992-05-22
4 20 1992-09-05
Write a subquery that finds all the duplicate birthdays in the same class. Then count the number of different classes with SELECT COUNT(DISTINCT class_id) from that subquery.
SELECT COUNT(DISTINCT class_id) FROM (
SELECT class_id, birthday
FROM YourTable
GROUP BY class_id, birthday
HAVING COUNT(*) > 1) AS x
In the inner select group by the class_id and take only those that have different numbers of unique and total birthdays.
Then count those class_ids in the outer select.
select count(*)
from
(
select class_id
from your_table
group by class_id
having count(*) > count(distinct birthday)
) tmp
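Both answers can be run against the question's rows. Note that in the sample as posted no class contains a repeated birthday, so both queries should return 0; the sketch below therefore also adds one hypothetical student (class 2, student 99, born 1994-02-01) to show a non-zero result. It uses Python's built-in sqlite3 module in place of MySQL, and the table name students and the function name classes_with_shared_birthday are assumptions.

```python
import sqlite3

def classes_with_shared_birthday(rows):
    """Return (by_pairs, by_distinct): the class count from each of the two answers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE students (class_id INTEGER, student_id INTEGER, birthday TEXT)")
    conn.executemany("INSERT INTO students VALUES (?, ?, ?)", rows)

    # Answer 1: find duplicated (class, birthday) pairs, then count distinct classes.
    by_pairs = conn.execute(
        """
        SELECT COUNT(DISTINCT class_id) FROM (
            SELECT class_id, birthday
            FROM students
            GROUP BY class_id, birthday
            HAVING COUNT(*) > 1) AS x
        """
    ).fetchone()[0]

    # Answer 2: a class qualifies when it has more rows than distinct birthdays.
    by_distinct = conn.execute(
        """
        SELECT COUNT(*) FROM (
            SELECT class_id
            FROM students
            GROUP BY class_id
            HAVING COUNT(*) > COUNT(DISTINCT birthday)) AS tmp
        """
    ).fetchone()[0]
    conn.close()
    return by_pairs, by_distinct

question_rows = [
    (1, 30, "1994-10-01"), (1, 23, "1994-01-01"), (1, 19, "1994-02-01"), (1, 11, "1994-03-01"),
    (2, 9, "1994-02-01"), (2, 43, "1994-03-01"),
    (3, 41, "1994-06-01"), (3, 21, "1994-05-01"),
    (4, 9, "1992-05-22"), (4, 20, "1992-09-05"),
]
# Hypothetical extra student sharing a birthday with student 9 in class 2.
with_duplicate = question_rows + [(2, 99, "1994-02-01")]

print(classes_with_shared_birthday(question_rows))
print(classes_with_shared_birthday(with_duplicate))
```

The two formulations agree as long as birthday is never NULL; COUNT(DISTINCT birthday) skips NULLs, so the second query could over-count classes with missing birthdays.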