Select top x records for each team - mysql

How do I select the x most recent records per team, Home and Away?
The query below gets me the most recent games for Swansea, both home and away. How do I get the same for all teams?
select d.date, d.hometeam, d.awayteam
from dump d
where
d.hometeam = 'Swansea'
or d.awayteam ='Swansea'
order by STR_TO_DATE(date, '%d/%m/%Y') desc limit 6
For an example of the data: I'm using the CSV data provided at football-data.co.uk: http://www.football-data.co.uk/mmz4281/1415/E0.csv
I'm using MySQL; however, if there is a function or stored procedure that you find ideal for this purpose, I can use SQL Server.
Edit: Expected Output
X       | Date     | Home Team | Away Team
----------------------------------------------
Swansea | 23/03/15 | Swansea   | Arsenal
Swansea | 14/03/15 | Man City  | Swansea
Man Utd | 14/03/15 | Man Utd   | Man City
Man Utd | 14/03/15 | Man Utd   | Liverpool
The leftmost column is the team in question. The table above shows 2 games per team; I'm trying to get 6 per team.
If you have suggestions on how better to present it, I'm open to them.

You just need to GROUP BY team and date:
SELECT d.team, d.date, d.hometeam, d.awayteam
FROM dump d
GROUP BY team, date
ORDER BY STR_TO_DATE(date, '%d/%m/%Y');

You want to get the 6 most recent games for each team separately.
There are two things to take care of.
First, your schema doesn't have a dedicated team column, so you have to build the list of all teams from the HomeTeam and AwayTeam columns.
Second, you want the 6 most recent games for each team. That means ranking the rows within each team group, but MySQL (before 8.0) doesn't support ranking functions. There is an alternative, though: user variables.
Based on my analysis, here is the query. Please try it.
SELECT
    r.homeTeamOrAwayTeam AS team
    , r.date
    , r.hometeam
    , r.awayteam
    -- , r.team_rank
FROM (
    SELECT
        d.date,
        d.hometeam,
        d.awayteam,
        subQuery.homeTeamOrAwayTeam,
        -- restart the counter whenever the team changes
        -- (named team_rank because RANK is a reserved word from MySQL 8.0)
        CASE WHEN @runningElement = subQuery.homeTeamOrAwayTeam THEN @groupRank := (@groupRank + 1)
             ELSE @groupRank := 1
        END AS team_rank
        -- remember the team of the current row for the next comparison
        , @runningElement := subQuery.homeTeamOrAwayTeam AS runningElement
    FROM
        dump d
        JOIN (
            -- to get all the hometeam & awayteam in one column
            SELECT d.hometeam AS homeTeamOrAwayTeam FROM dump AS d
            UNION
            SELECT d.awayteam AS homeTeamOrAwayTeam FROM dump AS d
        ) AS subQuery
        ON d.hometeam = subQuery.homeTeamOrAwayTeam OR d.awayteam = subQuery.homeTeamOrAwayTeam,
        -- for ranking purpose
        (SELECT @groupRank := 1) a,
        (SELECT @runningElement := '') b
    ORDER BY
        subQuery.homeTeamOrAwayTeam,
        -- newest first, so ranks 1..6 are the most recent games
        STR_TO_DATE(d.date, '%d/%m/%Y') DESC
) AS r
-- set your criteria (e.g. if you want only 6 results per team)
WHERE r.team_rank BETWEEN 1 AND 6
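On MySQL 8.0 and later, window functions are available, so the same "top N per team" ranking can probably be written without user variables. The following is only a sketch under assumptions: it runs against Python's built-in sqlite3 module (as a stand-in so it is self-contained), the four sample games are made up, and the date text is re-ordered with substr() where MySQL would use STR_TO_DATE(date, '%d/%m/%Y').

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dump (date TEXT, hometeam TEXT, awayteam TEXT)")
# Made-up sample games (not taken from the real CSV)
conn.executemany(
    "INSERT INTO dump VALUES (?, ?, ?)",
    [
        ("01/03/2015", "Swansea", "Arsenal"),
        ("08/03/2015", "Man City", "Swansea"),
        ("15/03/2015", "Swansea", "Man Utd"),
        ("22/03/2015", "Man Utd", "Man City"),
    ],
)

TOP_N_PER_TEAM = """
WITH games AS (
    -- one row per (team, game), whether the team played home or away
    SELECT hometeam AS team, date, hometeam, awayteam FROM dump
    UNION ALL
    SELECT awayteam AS team, date, hometeam, awayteam FROM dump
),
ranked AS (
    SELECT games.*,
           ROW_NUMBER() OVER (
               PARTITION BY team
               -- dd/mm/yyyy rearranged to yyyymmdd so text order equals date order
               ORDER BY substr(date, 7, 4) || substr(date, 4, 2) || substr(date, 1, 2) DESC
           ) AS rn
    FROM games
)
SELECT team, date, hometeam, awayteam
FROM ranked
WHERE rn <= ?
ORDER BY team, rn
"""

# Ask for the 2 most recent games per team (use 6 for the question's case)
rows = conn.execute(TOP_N_PER_TEAM, (2,)).fetchall()
for row in rows:
    print(row)
```

Each game appears twice in the games CTE, once for each side, which is what lets a single PARTITION BY team cover both home and away fixtures.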

Related

Obtaining database joined table with limit [duplicate]

I've been trying to join two tables while showing only a limited number (2) of results from the joined table. Unfortunately, I haven't been able to obtain the correct results. These are my tables:
Destinations
id  name
-------------
1   Bahamas
2   Caribbean
3   Barbados
Sailings
id  name           destination
------------------------------
1   Adventure      1
2   For Kids       2
3   All Inclusive  3
4   Seniors        1
5   Singles        2
6   Disney         1
7   Adults         2
This is the query I've tried:
SELECT
d.name as Destination,
s.name as Sailing
FROM destinations d
JOIN sailings s
ON s.destination = d.id
LIMIT 2
But this gives me only 2 rows in total, because the LIMIT applies to the whole result:
Destination  Sailing
-------------------------
Bahamas      Adventure
Caribbean    For Kids
I would like LIMIT 2 to be applied only to the joined table sailings
Expected Results:
Destination  Sailing
-------------------------
Bahamas      Adventure
Bahamas      Seniors
Caribbean    Singles
Caribbean    For Kids
Can someone please point me in the right direction?
Try:
SELECT d.name AS destination, tmp.name AS sailing
FROM (
    SELECT id, name, destination
    FROM (
        SELECT
            id,
            name,
            destination,
            @rn := IF(@p = destination, @rn + 1, 1) AS rn,
            @p := destination
        FROM sailings
        JOIN (SELECT @p := NULL, @rn := 0) AS vars
        ORDER BY destination
    ) AS T1
    WHERE rn <= 2
) tmp
JOIN (SELECT * FROM destinations LIMIT 0, 2) d
    ON (tmp.destination = d.id)
I made two derived tables and joined them. Note that tmp holds the sailings and d holds the destinations, so d.name is the destination name.
Your problem is that you want to take the two highest (or lowest) members of a group, for each group in the table. In this case, you want the first two sailings for each destination group.
The canonical way to handle this query in a database that supports analytic functions is ROW_NUMBER(). MySQL before 8.0 does not support it, so there we can simulate it using session variables:
SET @row_number = 0;
SET @destination = NULL;
SELECT
    t.Destination,
    t.Sailing
FROM
(
    SELECT
        @row_number:=CASE WHEN @destination = Destination
                          THEN @row_number + 1 ELSE 1 END AS rn,
        @destination:=Destination AS Destination,
        Sailing,
        id
    FROM
    (
        SELECT s.id AS id, d.name AS Destination, s.name AS Sailing
        FROM destinations d
        INNER JOIN sailings s
            ON s.destination = d.id
    ) t
    ORDER BY
        Destination,
        id
) t
WHERE t.rn <= 2
ORDER BY
    t.Destination,
    t.rn;
Note that Barbados appears as a single row, because in your sample data it has only one sailing. If you also want to restrict the result to destinations having two or more sailings, this can be done as well.
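As a rough sketch of that last restriction (keeping only destinations with two or more sailings), here is one way it might look with window functions, which MySQL 8.0+ provides. It is run through Python's built-in sqlite3 purely so the example is self-contained; the sailings_total column name is an assumption introduced for illustration, and the data is the sample from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE destinations (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE sailings (id INTEGER PRIMARY KEY, name TEXT, destination INTEGER);
INSERT INTO destinations VALUES (1, 'Bahamas'), (2, 'Caribbean'), (3, 'Barbados');
INSERT INTO sailings VALUES
    (1, 'Adventure', 1), (2, 'For Kids', 2), (3, 'All Inclusive', 3),
    (4, 'Seniors', 1), (5, 'Singles', 2), (6, 'Disney', 1), (7, 'Adults', 2);
""")

FIRST_TWO_SAILINGS = """
SELECT destination, sailing
FROM (
    SELECT d.name AS destination,
           s.name AS sailing,
           -- position of the sailing within its destination, lowest id first
           ROW_NUMBER() OVER (PARTITION BY d.id ORDER BY s.id) AS rn,
           -- how many sailings the destination has in total (hypothetical name)
           COUNT(*) OVER (PARTITION BY d.id) AS sailings_total
    FROM destinations d
    JOIN sailings s ON s.destination = d.id
) ranked
WHERE rn <= 2              -- at most two sailings per destination
  AND sailings_total >= 2  -- drop destinations with a single sailing
ORDER BY destination, rn
"""

rows = conn.execute(FIRST_TWO_SAILINGS).fetchall()
for row in rows:
    print(row)
```

Dropping the sailings_total condition brings Barbados back as a single row.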
Can you try
SELECT
d.name as Destination,
s.name as Sailing
FROM sailings s
JOIN (SELECT * from destinations LIMIT 2) d
ON s.destination = d.id
(You say you want to limit the sailings table, but I think you might want the limit on the destinations table, based on your expected output; you can adjust as necessary)

Select first and last match by column from a timestamp-ordered table in MySQL

Stackoverflow,
I need your help!
Say I have a table in MySQL that looks something like this:
-------------------------------------------------
OWNER_ID | ENTRY_ID | VEHICLE | TIME | LOCATION
-------------------------------------------------
1|1|123456|2016-01-01 00:00:00|A
1|2|123456|2016-01-01 00:01:00|B
1|3|123456|2016-01-01 00:02:00|C
1|4|123456|2016-01-01 00:03:00|C
1|5|123456|2016-01-01 00:04:00|B
1|6|123456|2016-01-01 00:05:00|A
1|7|123456|2016-01-01 00:06:00|A
...
1|999|123456|2016-01-01 09:10:00|A
1|1000|123456|2016-01-01 09:11:00|A
1|1001|123456|2016-01-01 09:12:00|B
1|1002|123456|2016-01-01 09:13:00|C
1|1003|123456|2016-01-01 09:14:00|C
1|1004|123456|2016-01-01 09:15:00|B
...
Please note that the table schema is just made up so I can explain
what I'm trying to accomplish...
Imagine that from ENTRY_ID 6 through 999, the LOCATION column is "A". All I need for my application is basically rows 1-6, then row 1000 onwards. Everything from row 7 to 999 is unnecessary data that doesn't need to be processed further. What I am struggling to do is either disregard those lines without having to move the processing of the data into my application, or better yet, delete them.
I'm scratching my head with this because:
1) I can't sort by LOCATION then just take the first and last entries, because the time order is important to my application and this will become lost - for example, if I processed this data in this way, I would end up with row 1 and row 1000, losing row 6.
2) I'd prefer to not move the processing of this data to my application, this data is superfluous to my requirements and there is simply no point keeping it if I can avoid it.
Given the above example data, what I want to end up with once I have a solution would be:
-------------------------------------------------
OWNER_ID | ENTRY_ID | VEHICLE | TIME | LOCATION
-------------------------------------------------
1|1|123456|2016-01-01 00:00:00|A
1|2|123456|2016-01-01 00:01:00|B
1|3|123456|2016-01-01 00:02:00|C
1|4|123456|2016-01-01 00:03:00|C
1|5|123456|2016-01-01 00:04:00|B
1|6|123456|2016-01-01 00:05:00|A
1|1000|123456|2016-01-01 09:11:00|A
1|1001|123456|2016-01-01 09:12:00|B
1|1002|123456|2016-01-01 09:13:00|C
1|1003|123456|2016-01-01 09:14:00|C
1|1004|123456|2016-01-01 09:15:00|B
...
Hopefully I'm making sense here and not missing something obvious!
@Aliester - Is there a way to determine that a row doesn't need to be
processed from the data contained within that row?
Unfortunately not.
@O. Jones - It sounds like you're hoping to determine the earliest and
latest timestamp in your table for each distinct value of ENTRY_ID,
and then retrieve the detail rows from the table matching those
timestamps. Is that correct? Are your ENTRY_ID values unique? Are they
guaranteed to be in ascending time order? Your query can be made
cheaper if that is true. Please, if you have time, edit your question
to clarify these points.
I'm trying to find the arrival time at a location, followed by the departure time from that location. Yes, ENTRY_ID is a unique field, but you cannot take it as a given that an earlier ENTRY_ID will equal an earlier timestamp - the incoming data is sent from a GPS unit on a vehicle and is NOT necessarily processed in the order they are sent due to network limitations.
This is a tricky problem to solve in SQL because SQL is about sets of data, not sequences of data. It's extra tricky in MySQL because other SQL variants have a synthetic ROWNUM function and MySQL doesn't as of late 2016.
You need the union of two sets of data here:
1) the set of rows of your database immediately before, in time, a change in location;
2) the set of rows immediately after a change in location.
To get that, you need to start with a subquery that generates all your rows, ordered by VEHICLE then TIME, with row numbers. (http://sqlfiddle.com/#!9/6c3bc7/2/0) Please notice that the sample data in Sql Fiddle is different from your sample data.
SELECT (@rowa := @rowa + 1) rownum,
       loc.*
FROM loc
JOIN (SELECT @rowa := 0) init
ORDER BY VEHICLE, TIME
Then you need to self-join that subquery, use the ON clause to exclude consecutive rows at the same location, and take the rows right before a change in location. Comparing consecutive rows is done by ON ... b.rownum = a.rownum+1. That is this query. (http://sqlfiddle.com/#!9/6c3bc7/1/0)
SELECT a.*
FROM (
    SELECT (@rowa := @rowa + 1) rownum,
           loc.*
    FROM loc
    JOIN (SELECT @rowa := 0) init
    ORDER BY VEHICLE, TIME
) a
JOIN (
    SELECT (@rowb := @rowb + 1) rownum,
           loc.*
    FROM loc
    JOIN (SELECT @rowb := 0) init
    ORDER BY VEHICLE, TIME
) b ON a.VEHICLE = b.VEHICLE
   AND b.rownum = a.rownum + 1
   AND a.location <> b.location
A variant of this subquery, where you say SELECT b.*, gets the rows right after a change in location (http://sqlfiddle.com/#!9/6c3bc7/3/0)
Finally, you take the setwise UNION of those two queries, order it appropriately, and you have your set of rows with the duplicate consecutive positions removed. Please notice that this gets quite verbose in MySQL because the nasty @rowa := @rowa + 1 hack used to generate row numbers has to use a different variable (@rowa, @rowb, etc) in each copy of the subquery. (http://sqlfiddle.com/#!9/6c3bc7/4/0)
SELECT a.*
FROM (
    SELECT (@rowa := @rowa + 1) rownum,
           loc.*
    FROM loc
    JOIN (SELECT @rowa := 0) init
    ORDER BY VEHICLE, TIME
) a
JOIN (
    SELECT (@rowb := @rowb + 1) rownum,
           loc.*
    FROM loc
    JOIN (SELECT @rowb := 0) init
    ORDER BY VEHICLE, TIME
) b ON a.VEHICLE = b.VEHICLE AND b.rownum = a.rownum + 1 AND a.location <> b.location
UNION
SELECT d.*
FROM (
    SELECT (@rowc := @rowc + 1) rownum,
           loc.*
    FROM loc
    JOIN (SELECT @rowc := 0) init
    ORDER BY VEHICLE, TIME
) c
JOIN (
    SELECT (@rowd := @rowd + 1) rownum,
           loc.*
    FROM loc
    JOIN (SELECT @rowd := 0) init
    ORDER BY VEHICLE, TIME
) d ON c.VEHICLE = d.VEHICLE AND c.rownum = d.rownum - 1 AND c.location <> d.location
ORDER BY VEHICLE, TIME
And in next-generation MySQL, available in beta now in MariaDB 10.2, this is much, much easier. The new generation has common table expressions and row numbering.
with loc as
(
SELECT ROW_NUMBER() OVER (PARTITION BY VEHICLE ORDER BY time) rownum,
loc.*
FROM loc
)
select a.*
from loc a
join loc b ON a.VEHICLE = b.VEHICLE
AND b.rownum = a.rownum + 1
AND a.location <> b.location
union
select b.*
from loc a
join loc b ON a.VEHICLE = b.VEHICLE
AND b.rownum = a.rownum + 1
AND a.location <> b.location
order by vehicle, time
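Where window functions are available (MySQL 8.0+ or MariaDB 10.2+), a different technique from the self-join above may also work: compare each row with its neighbours using LAG() and LEAD(), and keep only the rows that start or end a run at one location. This is a sketch under assumptions, executed through Python's built-in sqlite3 so that it is self-contained; the ten rows are made up, with entries 7 and 8 standing in for the long stretch of redundant "A" rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE loc (owner_id INT, entry_id INT, vehicle INT, time TEXT, location TEXT)"
)
# Made-up sample: entries 6..9 are one uninterrupted stay at location A
locations = ["A", "B", "C", "C", "B", "A", "A", "A", "A", "B"]
conn.executemany(
    "INSERT INTO loc VALUES (1, ?, 123456, ?, ?)",
    [(i + 1, f"2016-01-01 00:{i:02d}:00", where) for i, where in enumerate(locations)],
)

RUN_BOUNDARIES = """
SELECT entry_id, time, location
FROM (
    SELECT loc.*,
           -- location of the previous / next row of the same vehicle, by time
           LAG(location)  OVER (PARTITION BY vehicle ORDER BY time) AS prev_loc,
           LEAD(location) OVER (PARTITION BY vehicle ORDER BY time) AS next_loc
    FROM loc
) marked
WHERE prev_loc IS NULL OR prev_loc <> location   -- arrival: first row of a run
   OR next_loc IS NULL OR next_loc <> location   -- departure: last row of a run
ORDER BY vehicle, time
"""

kept_ids = [row[0] for row in conn.execute(RUN_BOUNDARIES)]
print(kept_ids)
```

Ordering by time rather than entry_id matters here, since the question says entry ids are not guaranteed to follow timestamp order. To delete the superfluous rows instead of skipping them, the same condition could be negated to find the interior rows of each run.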

Sum Top 10 Values

I’ve searched and I know this has been asked before but I am struggling to get my head around what I can / can’t do.
My cycling club records race results each time a rider has entered a race. Each result is awarded points - 50 for 1st, 49 for 2nd etc.
So the table looks like
resultid(pk) | riderid(fk) | leaguepts
1            | 1           | 50
2            | 2           | 49
3            | 3           | 48
4            | 1           | 50
5            | 2           | 42
6            | 3           | 50
7            | 4           | 30
...etc
I am trying to extract the sum of top 10 points awarded for each riderid from the results table.
(the actual database is a bit more complicated with a table for rider name / rider id and also a race table so we can display the results of each race etc but I just want to get the basic league table query working first of all)
So I want to extract the sum of the top 10 best scores for each rider. Then display each riders score, in a descending league table.
So far I’ve only had success using UNION ALL e.g.
SELECT sum(points) AS pts from
(
SELECT points from `results`
WHERE riderid = 1
ORDER BY points DESC
LIMIT 10
) as riderpts
UNION ALL
SELECT sum(points) AS pts from
(
SELECT points from `results`
WHERE riderid = 2
ORDER BY points DESC
LIMIT 10
) as riderpts
ORDER BY pts DESC
But there could be up to 90-odd riders who have registered at least one score so this query could get very big.
I found this, which looks like it should work for me but doesn't: "Sum top 5 values in MySQL". I changed the column names for my table, but it seems to sum all results, not the top 10 for each rider.
Alternatively I could just issue a query for each rider id. Not good I guess?
Subquerying is a problem because I can't limit on the inner query?
Run a job (manual or cron) to update the league table periodically and just display the table results?
Edit (not sure if this is the correct etiquette or I should start a new thread?). Gordon answered the question below but in the meantime I tried to work this out for myself using one of the links below. I could get results that returned the top 10 scores for each rider with the query below
set @riderid = '';
set @riderrow = 1;
select riderid, leaguepts, row_number
from
(
    select
        riderid,
        leaguepts,
        @riderrow := if(@riderid = riderid, @riderrow + 1, 1) as row_number,
        @riderid := riderid as dummy
    from wp_tt_results order by riderid, leaguepts desc
) as x where x.row_number <= 10;
BUT I can't see what I would need to do next to get the sum of top 10 results per riderid?
In MySQL, the easiest way to do this is probably to use variables:
SELECT riderid, sum(points)
FROM (SELECT r.*,
             (@rn := if(@r = riderid, @rn + 1,
                        if(@r := riderid, 1, 1)
                       )
             ) as seqnum
      FROM results r CROSS JOIN
           (SELECT @r := 0, @rn := 0) as wnw
      ORDER BY riderid, points DESC
     ) r
WHERE seqnum <= 10
GROUP BY riderid;
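For comparison, on MySQL 8.0+ the same league table could probably be built with ROW_NUMBER() instead of variables: rank each rider's scores, keep the best N, then SUM per rider. The sketch below assumes the results/leaguepts names from the question's table, uses made-up scores, sums the top 2 rather than the top 10 so the effect is visible on a small sample, and runs through Python's built-in sqlite3 only to stay self-contained.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (resultid INTEGER PRIMARY KEY, riderid INT, leaguepts INT)"
)
# Made-up scores: (resultid, riderid, leaguepts)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [(1, 1, 50), (2, 2, 49), (3, 3, 48), (4, 1, 50), (5, 2, 42),
     (6, 3, 50), (7, 4, 30), (8, 1, 10), (9, 2, 45)],
)

LEAGUE_TABLE = """
SELECT riderid, SUM(leaguepts) AS pts
FROM (
    SELECT riderid,
           leaguepts,
           -- 1 = the rider's best score, 2 = second best, ...
           ROW_NUMBER() OVER (PARTITION BY riderid ORDER BY leaguepts DESC) AS rn
    FROM results
) ranked
WHERE rn <= ?          -- keep only each rider's best N scores
GROUP BY riderid
ORDER BY pts DESC      -- league order, highest total first
"""

league = conn.execute(LEAGUE_TABLE, (2,)).fetchall()
for riderid, pts in league:
    print(riderid, pts)
```

Passing 10 instead of 2 gives the "sum of the ten best scores" the question asks for; riders with fewer results simply have all of their scores summed.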

How to get Cumulative count since month begin in mysql

ID int(11) (NULL) NO PRI (NULL)
CREATED_DATE datetime (NULL) YES (NULL)
The above are some of the fields of my table 'User'. I want the number of users per date and a cumulative count, grouped by date. I used the query below in MySQL.
SELECT q1.CREATED_DATE, q1.NO_OF_USER,
       (@runtot := @runtot + q1.NO_OF_USER) AS CUMM_REGISTRATION
FROM (SELECT date(CREATED_DATE) AS CREATED_DATE,
             COUNT(ID) AS NO_OF_USER
      FROM USER, (SELECT @runtot := 0) AS n
      GROUP BY CREATED_DATE
      ORDER BY CREATED_DATE) AS q1
This works fine. Now I want one additional column: the cumulative user count since 1 August. Is it possible to get this by modifying the query above, or is it better to handle it in code? Please suggest.
You can do this by adding another variable to the query:
SELECT q1.CREATED_DATE, q1.NO_OF_USER,
       (@runtot := @runtot + q1.NO_OF_USER) AS CUMM_REGISTRATION,
       -- only accumulate from 1 August on; assigning NULL to the variable
       -- on earlier rows would turn every later sum into NULL
       IF(q1.CREATED_DATE >= DATE('2013-08-01'),
          @Aug1tot := @Aug1tot + q1.NO_OF_USER, NULL) AS CUMM_SINCE_Aug1
FROM (SELECT date(CREATED_DATE) AS CREATED_DATE,
             COUNT(ID) AS NO_OF_USER
      FROM USER CROSS JOIN
           (SELECT @runtot := 0, @Aug1tot := 0) n
      GROUP BY date(CREATED_DATE)
      ORDER BY date(CREATED_DATE)
     ) AS q1
MySQL has a MONTH(yourDate) function, where MONTH('2013-08-15') returns 8.
You could either group your results by MONTH(yourDate), or filter your original data for MONTH(yourDate) = 8. Be careful if your data spans multiple years, as all dates where the month is 8 will be accumulated together. You could then add a sorting / filtering criterion based on YEAR(yourDate).
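If window functions are available (MySQL 8.0+), both running totals can likely be expressed with SUM() OVER and no variables at all. The following is a sketch, not a drop-in replacement: it runs through Python's built-in sqlite3 to stay self-contained, the six registrations are made up, the 2013-08-01 cut-off is the same assumed date as in the answer above, and created_day / cumm_since are names introduced only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user (id INTEGER PRIMARY KEY, created_date TEXT)")
# Made-up registrations around the assumed 1 August 2013 cut-off
conn.executemany(
    "INSERT INTO user VALUES (?, ?)",
    [(1, "2013-07-30 10:00:00"), (2, "2013-07-30 12:00:00"),
     (3, "2013-07-31 09:00:00"), (4, "2013-08-01 08:00:00"),
     (5, "2013-08-01 09:30:00"), (6, "2013-08-02 11:00:00")],
)

CUMULATIVE = """
SELECT created_day,
       no_of_user,
       -- running total over all days
       SUM(no_of_user) OVER (ORDER BY created_day) AS cumm_registration,
       -- running total that only counts days on or after :since,
       -- shown as NULL for earlier days
       CASE WHEN created_day >= :since THEN
            SUM(CASE WHEN created_day >= :since THEN no_of_user ELSE 0 END)
                OVER (ORDER BY created_day)
       END AS cumm_since
FROM (
    SELECT date(created_date) AS created_day, COUNT(id) AS no_of_user
    FROM user
    GROUP BY date(created_date)
) daily
ORDER BY created_day
"""

totals = conn.execute(CUMULATIVE, {"since": "2013-08-01"}).fetchall()
for row in totals:
    print(row)
```

Because the inner CASE contributes 0 for days before the cut-off, the second running total never depends on evaluation order the way a user variable does.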

More complex cumulative sum column in mysql

I read Create a Cumulative Sum Column in MySQL, and tried to adapt it to what I'm doing, but I can't seem to get it right.
The table:
Id (primary key)
AcctId
CommodId
Date
PnL
There is a unique index which contains AcctId, CommodId, Date. I want to get a cumulative total grouped by date.
This query
select c.date
, c.pnl
,(@cum := @cum + c.pnl) as "cum_pnl"
from commoddata c join (select @cum := 0) r
where
c.acctid = 2
and
c.date >= "2011-01-01"
and
c.date <= "2011-01-31"
order by c.date
will correctly calculate the running total for all records, showing data in the format
date        pnl  cum_pnl
==========  ===  =======
2011-01-01  1    1
2011-01-01  1    2
2011-01-01  1    3
2011-01-01  1    4
2011-01-02  1    5
2011-01-02  1    6
...
(there can be many records per date). What I want is
date        cum_pnl
==========  =======
2011-01-01  4
2011-01-02  6
...
But nothing I've tried works. TIA.
Alternatively, I think you can replace all your pnl with SUM(pnl) and let your @cum run across those. I think it would look like this (note that GROUP BY has to come before ORDER BY):
select c.date
,SUM(c.pnl)
,(@cum := @cum + SUM(c.pnl)) as "cum_pnl"
from commoddata c join (select @cum := 0) r
where
c.acctid = 2 and c.date >= "2011-01-01" and c.date <= "2011-01-31"
GROUP BY c.date
order by c.date
I'm just trying to figure out if SQL will give you grief over selecting cum_pnl when it is not a GROUP BY expression... maybe you can try grouping by it as well?
EDIT New Idea, if you're really not averse to nested queries, replace commoddata with a summed grouped query
select c.date
,c.pnl
,(@cum := @cum + c.pnl) as "cum_pnl"
from
(SELECT date, sum(pnl) as pnl FROM commoddata WHERE [conditions] GROUP BY date) c
join (select @cum := 0) r
order by c.date
Probably not the best way, but you can SELECT Date, MAX(Cum_PnL) FROM (existing_query_here) GROUP BY Date
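The nested-query idea (sum per date first, then accumulate) also maps onto a window function where one is available, e.g. MySQL 8.0+. A minimal sketch, assuming the commoddata columns from the question with made-up rows, and run through Python's built-in sqlite3 only so that it is self-contained; day_pnl is a name introduced for the illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE commoddata (
        id INTEGER PRIMARY KEY, acctid INT, commodid INT, date TEXT, pnl INT,
        UNIQUE (acctid, commodid, date)
    )
""")
# Made-up rows: account 2 has four records on 1 Jan and two on 2 Jan
conn.executemany(
    "INSERT INTO commoddata (acctid, commodid, date, pnl) VALUES (?, ?, ?, ?)",
    [(2, 1, "2011-01-01", 1), (2, 2, "2011-01-01", 1), (2, 3, "2011-01-01", 1),
     (2, 4, "2011-01-01", 1), (2, 1, "2011-01-02", 1), (2, 2, "2011-01-02", 1),
     (3, 1, "2011-01-01", 99)],  # another account, filtered out below
)

CUM_PNL_BY_DATE = """
SELECT date,
       -- running total of the per-day sums, in date order
       SUM(day_pnl) OVER (ORDER BY date) AS cum_pnl
FROM (
    -- step 1: collapse the many records per date into one row per date
    SELECT date, SUM(pnl) AS day_pnl
    FROM commoddata
    WHERE acctid = ? AND date >= ? AND date <= ?
    GROUP BY date
) daily
ORDER BY date
"""

cum_rows = conn.execute(CUM_PNL_BY_DATE, (2, "2011-01-01", "2011-01-31")).fetchall()
for row in cum_rows:
    print(row)
```

Aggregating in the inner query first keeps the running total independent of how many records each date has.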