Retrieving a variable number of rows using a table join

This adds a layer of complexity to another question I asked here: Using GROUP BY and ORDER BY in same MySQL query
Same table structure and problem, except this time imagine that the past_elections table is now set up as...
| election_ID | Date | jurisdiction | Race | Seats |
|-------------|------------|----------------|---------------|-------|
| 1 | 2016-11-08 | federal | president | 1 |
| 2 | 2016-11-08 | state_district | state senator | 2 |
(last record has seats set as 2 instead of 1.)
I want to use the Seats number to grab different numbers of records, ordered by the number of votes, for each group. So in this case with the following additional tables...
candidates
| Candidate_ID | FirstName | LastName | MiddleName |
|--------------|-----------|----------|------------|
| 1 | Aladdin | Arabia | A. |
| 2 | Long | Silver | John |
| 3 | Thor | Odinson | NULL |
| 4 | Baba | Yaga | NULL |
| 5 | Robin | Hood | Locksley |
| 6 | Sherlock | Holmes | J. |
| 7 | King | Kong | NULL |
past_elections-candidates
| ID | PastElection | Candidate | Votes |
|----|--------------|-----------|-------|
| 1 | 1 | 1 | 200 |
| 2 | 1 | 2 | 100 |
| 3 | 1 | 6 | 50 |
| 4 | 2 | 3 | 75 |
| 5 | 2 | 4 | 25 |
| 6 | 2 | 5 | 150 |
| 7 | 2 | 7 | 100 |
I would expect the following output:
| election_ID | FirstName | LastName | votes | percent |
|-------------|-----------|----------|-------|---------|
| 1 | Aladdin | Arabia | 200 | 0.5714 |
| 2 | Robin | Hood | 150 | 0.4286 |
| 2 | King | Kong | 100 | 0.2857 |
I've tried setting a variable and using that with a LIMIT clause, but variables don't work in LIMIT. I've also tried using ROW_NUMBER() (I'm not using MySQL 8.0 so this won't work, but I'd be willing to upgrade if it did) or a related workaround like @row_number := IF ... and then filtering based on the row number, but nothing has worked.
Last tried query:
SELECT pe.election_ID as elec,
       pe.Seats as s,
       pecs.row_num,
       c.FirstName,
       c.LastName,
       pecs.max_votes AS votes,
       pecs.max_votes / pecs.total_votes AS percent
FROM past_elections pe
JOIN `past_elections-candidates` pec ON pec.PastElection = pe.election_ID
JOIN (SELECT PastElection,
             Candidate,
             @row_num := IF(PastElection = @current_election, @current_election + 1, 1) as row_num,
             MAX(Votes) AS max_votes,
             SUM(Votes) AS total_votes,
             @current_election := PastElection
      FROM `past_elections-candidates`
      GROUP BY PastElection) pecs ON pecs.PastElection = pec.PastElection AND pecs.row_num <= pe.Seats
JOIN candidates c ON c.Candidate_ID = pec.Candidate

Use MySQL 8 regardless ;)
Use ROW_NUMBER to order the past elections:
SELECT *, ROW_NUMBER() OVER(PARTITION BY pastelection ORDER BY votes DESC) as rown
FROM `past_elections-candidates`
Join this to past_elections as a subquery. This covers only the bit you're stuck on (using pe.Seats to vary the number of rows returned per election) and doesn't include the percent bits:
SELECT *
FROM
  past_elections pe
  INNER JOIN
  (
    SELECT *, ROW_NUMBER() OVER(PARTITION BY PastElection ORDER BY Votes DESC) as rown
    FROM `past_elections-candidates`
  ) pecr
  ON pecr.PastElection = pe.election_ID AND
     pecr.rown <= pe.Seats
If you want to test things out on MySQL 8 before you upgrade, many of the db fiddle sites support it.
PS: the percentage can be computed at the same time as the ROW_NUMBER, e.g. with:
Votes/SUM(Votes) OVER(PARTITION BY PastElection)
For election ID 1 that sum will be 200+100+50, giving 200/350 = ~57%:
SELECT *, Votes/SUM(Votes) OVER(PARTITION BY PastElection) as pcnt, ROW_NUMBER() OVER(PARTITION BY PastElection ORDER BY Votes DESC) as rown
FROM `past_elections-candidates`
You need to calculate it in the subquery, before filtering on rown; otherwise the sum would cover only the rows that survive the filter.
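Putting the two pieces together, here is a minimal sketch that can be run without a MySQL 8 server. It assumes an in-memory SQLite database (3.25 or later) as a stand-in, loaded with the sample rows from the question; the Python wrapper and the `rows` variable are only scaffolding. The `* 1.0` is there because SQLite divides integers as integers, whereas MySQL's `/` already returns a decimal.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL 8; the window-function syntax is the same.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE past_elections (election_ID INTEGER, Seats INTEGER);
CREATE TABLE candidates (Candidate_ID INTEGER, FirstName TEXT, LastName TEXT);
CREATE TABLE `past_elections-candidates` (PastElection INTEGER, Candidate INTEGER, Votes INTEGER);
INSERT INTO past_elections VALUES (1, 1), (2, 2);
INSERT INTO candidates VALUES
  (1, 'Aladdin', 'Arabia'), (2, 'Long', 'Silver'), (3, 'Thor', 'Odinson'),
  (4, 'Baba', 'Yaga'), (5, 'Robin', 'Hood'), (6, 'Sherlock', 'Holmes'),
  (7, 'King', 'Kong');
INSERT INTO `past_elections-candidates` VALUES
  (1, 1, 200), (1, 2, 100), (1, 6, 50),
  (2, 3, 75), (2, 4, 25), (2, 5, 150), (2, 7, 100);
""")

query = """
SELECT pe.election_ID, c.FirstName, c.LastName, pecr.Votes AS votes,
       ROUND(pecr.pcnt, 4) AS percent
FROM past_elections pe
JOIN (
    -- Share of the vote and rank are both computed over ALL candidates
    -- of an election, before any row is filtered out.
    SELECT PastElection, Candidate, Votes,
           Votes * 1.0 / SUM(Votes) OVER (PARTITION BY PastElection) AS pcnt,
           ROW_NUMBER() OVER (PARTITION BY PastElection ORDER BY Votes DESC) AS rown
    FROM `past_elections-candidates`
) pecr ON pecr.PastElection = pe.election_ID
      AND pecr.rown <= pe.Seats   -- keep as many rows as there are seats
JOIN candidates c ON c.Candidate_ID = pecr.Candidate
ORDER BY pe.election_ID, pecr.rown
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With the sample data this keeps one row for election 1 and two rows for election 2, matching the expected output in the question.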

I don't have the right fields listed but this is as close as I'll probably get tonight... I've gotten the rows I need but need to join the candidate table to get the name out...
Using Dense_Rank seems to work for this...
SELECT * FROM (
SELECT pec.PastElection,
c.FirstName,
c.LastName,
pec.Votes,
pecs.totalVotes,
pe.Seats as s,
DENSE_RANK() OVER(PARTITION BY pec.PastElection ORDER BY pec.Votes DESC) as rank_votes
FROM `past_elections-candidates` pec
JOIN (SELECT PastElection,
Max(Votes) as maxVotes,
Sum(Votes) as totalVotes
FROM `past_elections-candidates`
GROUP BY PastElection) pecs ON pecs.PastElection = pec.PastElection
JOIN `past_elections` pe ON pec.PastElection = pe.election_ID
JOIN candidates c ON c.Candidate_ID = pec.Candidate
) t WHERE rank_votes <= s;
This results in
| PastElection | FirstName | LastName | Votes | totalVotes | s | rank_votes |
|--------------|-----------|----------|-------|------------|---|------------|
| 1 | Aladdin | Arabia | 200 | 350 | 1 | 1 |
| 2 | Robin | Hood | 150 | 350 | 2 | 1 |
| 2 | King | Kong | 100 | 350 | 2 | 2 |
I guess it's just kind of messy having the rank_votes and s columns in the data, but that's honestly fine with me if it gets the results I need.

Related

Integer incremented by line displayed

This is the team table:
+----+-------+--------+-------+
| id | alias | pwd | score |
+----+-------+--------+-------+
| 1 | login | mdp | 5 |
| 2 | azert | qsdfgh | 50 |
| 3 | test | test | 780 |
+----+-------+--------+-------+
This is the activity table:
+----+--------------+---------------------+-------+--------+
| id | localisation | name | point | answer |
+----+--------------+---------------------+-------+--------+
| 1 | Madras | Lancement du projet | 0 | NULL |
| 2 | Valparaiso | act1 | 450 | un |
| 3 | Amphi | act2 | 45 | deux |
| 4 | Amphix | act3 | 453 | trois |
| 5 | Amphix | act4 | 45553 | qautre |
| 6 | Madras | Lancement du projet | 0 | NULL |
| 7 | Valparaiso | act1 | 450 | un |
| 8 | Amphi | act2 | 45 | deux |
| 9 | Amphix | act3 | 453 | trois |
| 10 | Amphix | act4 | 40053 | fin |
+----+--------------+---------------------+-------+--------+
This is the feed table:
+--------+---------------------+------------+--------+
| FeedId | ts | ActivityId | TeamId |
+--------+---------------------+------------+--------+
| 1 | 2023-01-10 00:02:06 | 1 | 3 |
| 2 | 2023-01-10 00:02:28 | 2 | 3 |
| 3 | 2023-01-10 00:21:13 | 3 | 3 |
| 4 | 2023-01-10 00:24:49 | 3 | 3 |
| 5 | 2023-01-10 00:30:58 | 1 | 1 |
+--------+---------------------+------------+--------+
I did this
MariaDB [sae]> SELECT @rownum:=@rownum+1 as 'Classement', t.alias, SUM(a.point) as total_points FROM activity a INNER JOIN feed f ON a.id = f.ActivityId INNER JOIN team t ON f.TeamId = t.id JOIN (SELECT @rownum:=0) r GROUP BY t.alias ORDER BY total_points DESC, Classement DESC;
+------------+-------+--------------+
| Classement | alias | total_points |
+------------+-------+--------------+
| 2 | test | 540 |
| 1 | login | 0 |
+------------+-------+--------------+
Here the team with the highest number of points gets rank 2 instead of 1, and sorting by Classement ASC does not change anything.
I would like this:
+------------+-------+--------------+
| Classement | alias | total_points |
+------------+-------+--------------+
| 1 | test | 540 |
| 2 | login | 0 |
+------------+-------+--------------+
Do you have any idea how to go about incrementing this "backwards" integer?

Unless you are using an end-of-life version of MariaDB, you should use the window function RANK() instead of dealing with user variables.
Incrementing a user variable returns the same value as ROW_NUMBER(), but that is not correct here, since teams with the same score should get the same ranking.
SELECT RANK() OVER (ORDER BY subq.total_points DESC) AS 'Classement',
subq.* FROM (
SELECT team.alias, SUM(activity.point) AS total_points FROM activity
JOIN feed ON activity.id = feed.ActivityId
JOIN team ON feed.TeamId = team.id GROUP BY team.alias ) AS subq
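To see the difference between RANK() and a plain row counter, the sketch below runs the same query against an in-memory SQLite database (an assumption made so the example is self-contained; the SQL is the same on MariaDB 10.2+ and MySQL 8). It uses the question's data plus one hypothetical feed row that gives team azert 0 points, so that two teams are tied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE team (id INTEGER, alias TEXT);
CREATE TABLE activity (id INTEGER, point INTEGER);
CREATE TABLE feed (FeedId INTEGER, ActivityId INTEGER, TeamId INTEGER);
INSERT INTO team VALUES (1, 'login'), (2, 'azert'), (3, 'test');
INSERT INTO activity VALUES (1, 0), (2, 450), (3, 45);
INSERT INTO feed VALUES (1, 1, 3), (2, 2, 3), (3, 3, 3), (4, 3, 3), (5, 1, 1);
-- Hypothetical extra row (not in the question): azert also scores 0 points.
INSERT INTO feed VALUES (6, 1, 2);
""")

query = """
SELECT RANK() OVER (ORDER BY subq.total_points DESC) AS Classement,
       ROW_NUMBER() OVER (ORDER BY subq.total_points DESC, subq.alias) AS row_no,
       subq.alias,
       subq.total_points
FROM (
    SELECT team.alias, SUM(activity.point) AS total_points
    FROM activity
    JOIN feed ON activity.id = feed.ActivityId
    JOIN team ON feed.TeamId = team.id
    GROUP BY team.alias
) AS subq
ORDER BY Classement, subq.alias
"""

ranking = conn.execute(query).fetchall()
for row in ranking:
    print(row)
```

The two teams with 0 points share rank 2 under RANK(), while ROW_NUMBER() numbers them 2 and 3.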

This handles the case where two or more teams have the same score: they all get the same ranking.
It works on MySQL and MariaDB versions without window functions. Note that rank is a reserved word in MySQL 8.0, so the alias is quoted, and that MySQL 8.0 deprecates assigning user variables inside expressions.
select @rank := CASE
         WHEN @totalval = total_points THEN @rank
         WHEN (@totalval := total_points) IS NOT NULL THEN @rank + 1
       END AS `rank`,
       s.*
from (
  SELECT t.alias, SUM(a.point) as total_points
  FROM activity a
  INNER JOIN feed f ON a.id = f.ActivityId
  INNER JOIN team t ON f.TeamId = t.id
  JOIN (SELECT @rank:=0, @totalval := 0) r
  GROUP BY t.alias
  ORDER BY total_points DESC
) as s;
Check it from here : https://dbfiddle.uk/7lKLu4Pw

Using the same logic as yours, you can do it as follows:
select @rownum:=@rownum+1 as 'Classement', s.*
from (
  SELECT t.alias, SUM(a.point) as total_points
  FROM activity a
  INNER JOIN feed f ON a.id = f.ActivityId
  INNER JOIN team t ON f.TeamId = t.id
  JOIN (SELECT @rownum:=0) r
  GROUP BY t.alias
  ORDER BY total_points DESC
) as s;
Check it here: https://dbfiddle.uk/TEz3UT97
It works on MySQL and MariaDB.

MySQL: Finding the most efficient use of INNER JOIN with subquery

I have a working query using INNER JOIN and a subquery but was wondering if there is a more efficient way of writing it.
with prl
as
(
  SELECT `number`, creator, notes
  FROM ratings
  INNER JOIN
  projects on ratings.project_id = projects.project_id
  WHERE ratings.rating = 5 AND projects.active = 1
)
SELECT prl.`number`, creator, notes
FROM prl
INNER JOIN(
  SELECT `number`
  FROM prl
  GROUP BY `number`
  HAVING COUNT(creator) > 1
)temp ON prl.`number` = temp.`number`
ORDER BY temp.`number`
projects table
| project_id | number | creator | active |
|------------|--------|---------|--------|
| 1 | 3 | bob | 1 |
| 2 | 4 | mary | 1 |
| 3 | 5 | asi | 1 |
ratings table
| project_id | notes | rating |
|------------|-------|--------|
| 1 | note1 | 5 |
| 1 | note2 | 5 |
| 3 | note3 | 5 |
| 1 | note4 | 1 |
| 2 | note5 | 5 |
| 3 | note6 | 2 |
result
| number | creator | notes |
|--------|---------|-------|
| 3 | bob | note1 |
| 3 | bob | note2 |

It seems like you're using a MySQL version that supports window functions. If so, then try this:
SELECT number, creator, notes
FROM
  (SELECT p.number, p.creator, r.notes,
          COUNT(p.creator) OVER (PARTITION BY p.creator) AS cnt
   FROM projects p
   JOIN ratings r ON p.project_id=r.project_id
   WHERE r.rating=5
   AND p.active = 1) v
WHERE cnt > 1;
As for whether this is more efficient, I'm not really sure because it depends on your table indexes, but for a small dataset I assume this will do well.

MySQL query using limit and offset does not return expected ordered results

I am trying to get the count of entries by users grouped by year, month and user name from a table which has 45M entries. The query result has around 4M records which I wasn't able to get in one go so I decided to use limit and offset.
To retrieve the first 1M records I've written the query below:
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
limit 1000000
) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
order by t.year, t.month, uis.nick;
To retrieve the second 1M records I've set an offset of 999998 so I would have 2 overlapping rows so that I could double check that it's correct, hence this query below:
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
limit 999998, 1000000
) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
order by t.year, t.month, uis.nick;
Then to compare the results and double check, I've got the tail of the first 1M records and the head of the second 1M records. There should be 2 overlapping records in my understanding -since I've used an offset of 999998- but there is something wrong.
It's also evident that there is something wrong with the query because the first file ends with zzzzz but then the second file starts with 0 3 kalem ucu which should not be after z in alphabetical order.
$ tail entry_counts_by_users_1_1m.csv
| user_nick | user_id | date | entry_count |
|-------------|---------|---------|-------------|
| zskal | 493395 | 2013-05 | 8 |
| zuhanzee | 397659 | 2013-05 | 2 |
| zulmet | 446672 | 2013-05 | 74 |
| zuluuuuuu | 1240043 | 2013-05 | 9 |
| zverkov | 502616 | 2013-05 | 2 |
| zvezdite | 750458 | 2013-05 | 1 |
| zx | 249598 | 2013-05 | 15 |
| zyprexa 5mg | 779519 | 2013-05 | 16 |
| zzgx | 584985 | 2013-05 | 2 |
| zzzzz | 22730 | 2013-05 | 1 |
$ head entry_counts_by_users_1m_2m.csv
| nick | user_id | DATE | count |
|---------------|---------|---------|-------|
| 0 3 kalem ucu | 624699 | 2013-05 | 4 |
| 0132 | 995914 | 2013-05 | 3 |
| 03072010 | 960606 | 2013-05 | 9 |
| 0312020008 | 804486 | 2013-05 | 2 |
| 0326 | 446816 | 2013-05 | 1 |
| 05 | 575534 | 2013-05 | 1 |
| 05012009 | 1171153 | 2013-05 | 6 |
| 0904 | 514964 | 2013-05 | 2 |
| 0kmzeka | 777191 | 2013-05 | 4 |
Could you help me understand what I am doing wrong here?
+-----------+
| @@version |
+-----------+
| 8.0.19    |
+-----------+
UPDATE
These are the results I get after using ORDER BY in my subquery:
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
order by year, month, user_id
limit 1000000) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
For the first 1M records:
$ tail entry_counts_by_users_1_1m.csv
| user_name | user_id | date | entry_count |
|----------------------------|---------|---------|-------------|
| statistic er | 667546 | 2012-06 | 1 |
| mula | 612905 | 2013-02 | 1 |
| sisman cirkin bi de kezban | 1327434 | 2013-02 | 2 |
| tyra34 | 1329280 | 2013-03 | 1 |
| ecemazkan | 1332628 | 2013-02 | 1 |
| susamlicubuk | 1333079 | 2013-02 | 1 |
| hemenhemenherterim | 631784 | 2011-04 | 1 |
| umursamaz tavrin hastasi | 1060158 | 2012-09 | 2 |
| uslucocuk | 1254758 | 2012-09 | 1 |
| dharamsala | 956110 | 2012-09 | 1 |
select SQL_BIG_RESULT uis.nick, uis.user_id, CONCAT(t.year, '-', LPAD(t.month, 2, 0)) AS DATE, t.count
from (select SQL_BIG_RESULT e.user_id, YEAR(e.created_at) as year, MONTH(e.created_at) as month, COUNT(*) AS count
from entries e
group by YEAR(e.created_at), MONTH(e.created_at), e.user_id
order by year, month, user_id
limit 999998, 1000000) t
inner join users u on u.id = t.user_id
inner join user_infos ui on ui.user_id = u.id
inner join user_identifiers uis on uis.user_info_id = ui.id
For the second 1M records:
$ head entry_counts_by_users_1m_2m.csv
| user_name | user_id | date | entry_count |
|-----------|---------|---------|-------------|
| ssg | 8097 | 2013-06 | 101 |
| ssg | 8097 | 2013-07 | 73 |
| ssg | 8097 | 2013-08 | 100 |
| ssg | 8097 | 2013-09 | 88 |
| ssg | 8097 | 2013-10 | 84 |
| ssg | 8097 | 2013-11 | 54 |
| ssg | 8097 | 2013-12 | 64 |
| ssg | 8097 | 2014-01 | 78 |
| ssg | 8097 | 2014-02 | 31 |
I still don't get what I am doing wrong.

Starting in MySQL 8.0.13, implicit ordering for GROUP BY has been removed:
Incompatible Change: The deprecated ASC or DESC qualifiers for GROUP BY clauses have been removed. Queries that previously relied on GROUP BY sorting may produce results that differ from previous MySQL versions. To produce a given sort order, provide an ORDER BY clause.
The implicit ordering has been deprecated since 5.6, so there has been some warning.
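Your subquery uses GROUP BY with no ORDER BY, so the order of its result set is not specified and might change from one run to the next. To produce a stable result, use an ORDER BY before the LIMIT, on columns that together identify each row uniquely (here year, month, user_id). The LIMIT only controls which rows the subquery returns; the joins that follow do not preserve their order, so keep an ORDER BY on the outer query as well if the final output has to be sorted.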
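A small sketch of paging with a deterministic sort key follows. It assumes an in-memory SQLite database with a handful of made-up `entries` rows, and `fetch_page` is a hypothetical helper; the point is only that an overlapping offset returns the same boundary row once the ORDER BY covers a unique key. In MySQL, `LIMIT 3 OFFSET 2` can also be written `LIMIT 2, 3`, and YEAR()/MONTH() would replace the strftime calls.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (user_id INTEGER, created_at TEXT)")
# Made-up sample rows, inserted in no particular order.
conn.executemany("INSERT INTO entries VALUES (?, ?)", [
    (2, "2013-05-03"), (1, "2013-05-01"), (1, "2013-05-09"),
    (3, "2013-06-11"), (2, "2013-06-02"), (1, "2012-12-30"),
])

# (year, month, user_id) is unique per group, so the ORDER BY defines a
# total order and every page boundary is reproducible.
PAGE_SQL = """
SELECT CAST(strftime('%Y', created_at) AS INTEGER) AS year,
       CAST(strftime('%m', created_at) AS INTEGER) AS month,
       user_id,
       COUNT(*) AS count
FROM entries
GROUP BY year, month, user_id
ORDER BY year, month, user_id
LIMIT ? OFFSET ?
"""

def fetch_page(limit, offset):
    """Return one page of the grouped counts (hypothetical helper)."""
    return conn.execute(PAGE_SQL, (limit, offset)).fetchall()

first = fetch_page(3, 0)
second = fetch_page(3, 2)  # starts one row before the end of the first page

print(first)
print(second)
```

Here the last row of `first` and the first row of `second` are the same group, which is the overlap check the question was attempting.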

I need to select a group total as well as the individual rows?

I have the following result set...
| Name | Team | Score |
|------|------|-------|
| A | 1 | 10 |
| B | 1 | 11 |
| C | 2 | 9 |
| D | 2 | 15 |
and I want to add an extra column to the results set for the team score so I can sort on it and end up with the following data set...
| Name | Team | Score | TeamScore |
|------|------|-------|-----------|
| D | 2 | 15 | 24 |
| C | 2 | 9 | 24 |
| B | 1 | 11 | 21 |
| A | 1 | 10 | 21 |
So I end up with the top team first with the members in order.
My actual data is way more complicated than this and pulls in data from several tables but if you can solve this one I can solve my bigger issue!

Join the table to a query that returns the total for each team:
select t.*, s.teamscore
from tablename t
inner join (
select team, sum(score) teamscore
from tablename
group by team
) s on s.team = t.team
order by s.teamscore desc, t.team, t.score desc
See the demo.
Results:
| Name | Team | Score | teamscore |
| ---- | ---- | ----- | --------- |
| D | 2 | 15 | 24 |
| C | 2 | 9 | 24 |
| B | 1 | 11 | 21 |
| A | 1 | 10 | 21 |

In MySQL 8+, we can simplify and just use SUM as an analytic function:
SELECT
Name,
Team,
Score,
SUM(Score) OVER (PARTITION BY Team) AS TeamScore
FROM yourTable
ORDER BY
TeamScore DESC,
Team,
Score DESC;

How to generate auto increment temporary field in MySQL with join and order by clause

Here is my peoples table
+----+------------------+------------+
| id | name | address |
+----+------------------+------------+
| 1 | Tony Stark | Chicago |
| 2 | Natasha Romanoff | Boston |
| 3 | Steve Rogers | Arkansas |
| 4 | Bruce Banner | Long Beach |
+----+------------------+------------+
Here is my roles table
+----+-----------------+-----------+
| id | role_name | people_id |
+----+-----------------+-----------+
| 1 | Iron Man | 1 |
| 2 | Black Widow | 2 |
| 3 | Captain America | 3 |
| 4 | Hulk | 4 |
+----+-----------------+-----------+
I want to get the data from those 2 tables, along with a generated sequential number field, using this query:
SELECT @rownum := @rownum + 1 as no, peoples.name, roles.role_name
FROM peoples
CROSS JOIN (select @rownum := 0) r
JOIN roles ON roles.people_id = peoples.id
ORDER BY peoples.name ASC
But the result is not what I expect. Here is the result:
+------+------------------+-----------------+
| no | name | role_name |
+------+------------------+-----------------+
| 4 | Bruce Banner | Hulk |
| 2 | Natasha Romanoff | Black Widow |
| 3 | Steve Rogers | Captain America |
| 1 | Tony Stark | Iron Man |
+------+------------------+-----------------+
Maybe it's because of the JOIN and ORDER BY. How can I fix that so I get a sequential number?

If you want to apply a row number by name over the peoples table, you should first generate it in a subquery, and then join to that subquery:
SELECT
    p.no,
    p.name,
    r.role_name
FROM
(
    SELECT id, name, address, @rownum:=@rownum + 1 AS no
    FROM peoples
    ORDER BY name
) p
CROSS JOIN (SELECT @rownum := 0) t
INNER JOIN roles r
    ON p.id = r.people_id
ORDER BY
    p.no;
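On MySQL 8 or MariaDB 10.2 and later, ROW_NUMBER() is an alternative to the user variable, swapped in here as a different technique from the answer above. The sketch below assumes an in-memory SQLite database as a stand-in, loaded with the question's rows, so it can be run as-is; the alias `no` is quoted only to be safe across engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE peoples (id INTEGER, name TEXT, address TEXT);
CREATE TABLE roles (id INTEGER, role_name TEXT, people_id INTEGER);
INSERT INTO peoples VALUES
  (1, 'Tony Stark', 'Chicago'), (2, 'Natasha Romanoff', 'Boston'),
  (3, 'Steve Rogers', 'Arkansas'), (4, 'Bruce Banner', 'Long Beach');
INSERT INTO roles VALUES
  (1, 'Iron Man', 1), (2, 'Black Widow', 2),
  (3, 'Captain America', 3), (4, 'Hulk', 4);
""")

# The numbering is defined by the window's own ORDER BY, so it does not
# depend on the order in which the join happens to produce rows.
query = """
SELECT ROW_NUMBER() OVER (ORDER BY p.name) AS `no`,
       p.name,
       r.role_name
FROM peoples p
JOIN roles r ON r.people_id = p.id
ORDER BY p.name
"""

numbered = conn.execute(query).fetchall()
for row in numbered:
    print(row)
```

The rows come back numbered 1 to 4 in name order, starting with Bruce Banner.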
