I'm looking for a way to do this "cleanly" (not with 3..n cross JOINs). I just want to know whether it's possible in SQL; if not, I'll go for another solution.
I'll use numbers instead of dates for simplicity.
I have n rows with n tasks and n items:
task  item  start  end
1     1     1      5
1     2     2      6
1     3     0      4
1     4     8      10
In this case I want the min(start) and max(end) of each group of overlapping ranges, so the result will be:
task  item   start  end
1     1,2,3  0      6
1     4      8      10
Any ideas on how to solve this in SQL? It's something of a challenge; if it can't be done this way, I'll do it in Python.
Thank you
This is similar to the problem I answered here, and to similar data "island" problems. However, it is more complicated in your case, as identifying the "islands" depends on more than just the immediately preceding record.
It will end up looking something like this:
SET @iEnd = -1; /* init value should be something you don't expect to see */
SET @task = -1; /* init value should be something you don't expect to see */
SET @isNewIsland = 0; /* init value doesn't actually matter */
SET @i = 0;
SELECT islandNum
     , GROUP_CONCAT(item ORDER BY item) AS items
     , MIN(start) AS iStart
     , MAX(end) AS iEnd
FROM (
    SELECT @isNewIsland := IF(@task <> task OR start > @iEnd, 1, 0)
         , @task := task, item, start, end
         , @i := IF(@isNewIsland = 1, @i + 1, @i) AS islandNum
         /* track the furthest end seen so far within the current island */
         , @iEnd := IF(@isNewIsland = 1, end, GREATEST(end, @iEnd))
    FROM ( /* Session (@) variable evaluation can be a bit unpredictable;
              the subquery helps guarantee ordering before evaluation */
        SELECT task, item, start, end
        FROM theTable
        ORDER BY task, start, end
    ) AS subQ
) AS subQ2
GROUP BY islandNum;
Some are not fond of needing the separate, preceding SET statements; to avoid them, replace ) AS subQ with
) AS subQ, (SELECT @iEnd := -1, @task := -1, @isNewIsland := 0, @i := 0) AS sInit
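Since the question mentions Python as the fallback, here is a minimal sketch of the same island logic there, for comparison. The function name merge_islands and the tuple layout (task, item, start, end) are my own assumptions, not anything from the question; like the query above, it treats ranges that merely touch (start equal to the running end) as overlapping.

```python
def merge_islands(rows):
    """rows: iterable of (task, item, start, end) tuples (assumed layout).

    Returns one (task, items, start, end) tuple per island of overlapping ranges.
    """
    islands = []
    # Same ordering as the SQL subquery: task, then start, then end.
    for task, item, start, end in sorted(rows, key=lambda r: (r[0], r[2], r[3])):
        last = islands[-1] if islands else None
        if last is not None and last[0] == task and start <= last[3]:
            # Overlaps the current island: add the item and extend the end.
            last[1].append(item)
            last[3] = max(last[3], end)
        else:
            # Different task, or a gap before this range: start a new island.
            islands.append([task, [item], start, end])
    return [(t, sorted(items), s, e) for t, items, s, e in islands]


sample = [(1, 1, 1, 5), (1, 2, 2, 6), (1, 3, 0, 4), (1, 4, 8, 10)]
print(merge_islands(sample))
```

With the sample rows from the question this should give one island for items 1, 2 and 3 spanning 0 to 6, and a second island for item 4 spanning 8 to 10.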
I'm trying to add a new column that shows the rank (or sequence) of row results by date.
I've written:
SELECT
@row_number:=(CASE
WHEN @member_id = lh.member_id and lc.ladder_advocacy is not null
THEN @row_number + 1
when @member_id = lh.member_id and lc.ladder_advocacy is null then "null"
ELSE 1 /* there is an error here - i need it to return a 1 if not null, then 2 for the 2nd instance, etc */
END) AS rank_advocacy,
@member_id:=lh.member_id AS member_id,
lh.ladder_change,
lc.name,
lc.ladder_advocacy,
lc.ladder_elected,
lc.ladder_policy,
lc.ladder_engagement,
lc.ladder_newventure,
lc.ladder_collective,
lc.is_trigger
FROM
leenk_ladder_history AS lh
LEFT JOIN
leeds_so.leenk_ladder_config AS lc ON lh.ladder_config_id = lc.id
WHERE
ladder_change = 1 AND trigger_active = 1
ORDER BY member_id, trigger_event_date DESC;
There is an error at row 4, and I'm not sure how to fix it. For the first result, I want to return 1. For the second result, I want to return @row_number + 1. For the third result, @row_number + 2 (etc).
How do I achieve this?
I don't understand how the condition lc.ladder_advocacy is not null is being used. However, the basic structure is:
SELECT (@row_number := IF(@member_id = lh.member_id, @row_number + 1,
                          IF(@member_id := lh.member_id, 1, 1)
                         )
       ) as rank_advocacy,
lh.ladder_change,
. . .
Some really important points:
You need to assign @member_id and @row_number in the same expression. MySQL (as with all other databases) does not guarantee the order of evaluation of expressions.
In more recent versions of MySQL, I think the ORDER BY needs to go in a subquery, with the variable expressions in the outer query.
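To make the intended numbering concrete, here is a rough sketch of the same "restart at 1 for each member" logic in Python. The function name rank_per_member and the (member_id, event_date) tuple layout are assumptions for illustration only; dates are assumed to be ISO strings so that they sort correctly as text.

```python
def rank_per_member(rows):
    """rows: list of (member_id, event_date) tuples (assumed layout).

    Orders by member_id, then event_date descending, and numbers the rows
    1, 2, 3, ... within each member.
    """
    # Two stable sorts: newest date first, then group by member.
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    ordered = sorted(ordered, key=lambda r: r[0])

    ranked = []
    prev_member = None
    row_number = 0
    for member_id, event_date in ordered:
        # Compare and update the "previous member" in one step, which is
        # what the single SQL expression above is doing.
        row_number = row_number + 1 if member_id == prev_member else 1
        prev_member = member_id
        ranked.append((member_id, event_date, row_number))
    return ranked


rows = [(7, "2020-01-01"), (7, "2020-03-01"), (9, "2020-02-01")]
print(rank_per_member(rows))
```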
I have a Leaderboard that looks like this:
|--------------------------------------|
| userId | allTimePoints | allTimeRank |
|--------------------------------------|
| .. | ... | ... |
| xx | 5555555 | ? |
| .. | ... | ... |
----------------------------------------
Let's assume the table has a million records, and that allTimePoints is updated constantly. When a user asks to see the Leaderboard, I'd like to be able to show them their rank, score, as well as their closest competitors. I'd like to achieve the following:
figure out the rank of each user (sort table by allTimePoints DESC)
figure out paging offset so that leaderboard viewer is in the middle of reduced resultset
do it within an acceptable runtime (e.g. create perception of instant response even if hundreds of thousands of other users are also hitting the Leaderboard screen at the same time)
I've started like this and this takes about 0.4sec on my machine when the table has 1mil rows.
SET @rowIndex := 0;
SET @rank := 0;
SET @prev := NULL;
SET @userIdPosition := 0;
SELECT
@rowIndex := @rowIndex+1 AS rowIndex,
userId,
@rank := IF(@prev=allTimePoints, @rank, @rank+1) AS rank,
@prev := allTimePoints AS allTimePoints,
@userIdPosition := IF(userId=1860, @rowIndex, @userIdPosition) AS requestedOffset
FROM Leaderboard
ORDER BY allTimePoints DESC;
Btw, the runtime benefit of this method over using a self-join, is described here (it's much faster): http://code.openark.org/blog/mysql/sql-ranking-without-self-join
I keep rowIndex and rank as separate variables, so that I can calculate the requesting user's paging offset more accurately if there are rank ties (i.e., n users have same score).
So far so good, although I fear that if this doesn't reduce to msec runtime, it won't be viable when hundreds of thousands of users run the query simultaneously.
To make matters worse, if I expand this query to work correctly with paging as described above, then runtime increases to 1.5sec
SET @rowIndex := 0;
SET @rank := 0;
SET @prev := NULL;
SET @userIdPosition := 0;
SELECT sortedL.userId, sortedL.rank, sortedL.allTimePoints
FROM
(SELECT
@rowIndex := @rowIndex+1 AS rowIndex,
userId,
@rank := IF(@prev=allTimePoints, @rank, @rank+1) AS rank,
@prev := allTimePoints AS allTimePoints,
@userIdPosition := IF(userId=1860, @rowIndex, @userIdPosition) AS requestedOffset
FROM Leaderboard
ORDER BY allTimePoints DESC) AS sortedL
-- simulate paging, as LIMIT doesn't seem to accept variables
WHERE sortedL.rowIndex > sortedL.requestedOffset -15 AND sortedL.rowIndex < sortedL.requestedOffset + 15;
This returns 29 users and the requesting user is in the middle, as desired.
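For reference, the window arithmetic I'm after can be sketched in Python like this. The function name window_around and the radius parameter are just illustrative names; the list is assumed to be already sorted by allTimePoints descending.

```python
def window_around(sorted_user_ids, user_id, radius=15):
    """Return the user ids whose 1-based row index lies strictly between
    position - radius and position + radius, as in the WHERE clause above."""
    position = sorted_user_ids.index(user_id) + 1  # 1-based row index
    return [
        uid
        for idx, uid in enumerate(sorted_user_ids, start=1)
        if position - radius < idx < position + radius
    ]


ids = list(range(1, 101))       # stand-in for a sorted leaderboard
page = window_around(ids, 50)
print(len(page), page[14])      # 29 rows, requested user in the middle
```

Near the top or bottom of the leaderboard the window is simply truncated, so fewer than 29 rows come back.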
If I run this with EXPLAIN, I can see that the subquery is using a FILESORT, but the results are not indexed, and hence the outer SELECT is forced to do yet another full scan of the resultset using WHERE (slower than FILESORT).
Question (1): how can I optimize this?
Another idea was to store the ranking in an indexed column: allTimeRank. I thought I'd experiment with sorting the table in a procedure on a schedule (say, every 10 min), and then offer very quick access with a simpler SELECT that would utilize the index. I haven't managed to get this to work properly, it doesn't seem to be using the condition in my WHERE clause (the ranking stored in allTimeRank is incorrect, and MySQL complains so I have to turn off safe updates to get it to even run)
SET SQL_SAFE_UPDATES=0;
SET @rowIndex := 0;
SET @rank := 0;
SET @prev := NULL;
SET @userIdPosition := 0;
UPDATE Leaderboard L,
(SELECT
@rowIndex := @rowIndex+1 AS rowIndex,
userId,
@rank := IF(@prev=allTimePoints, @rank, @rank+1) AS rank,
@prev := allTimePoints AS allTimePoints,
@userIdPosition := IF(userId=1860, @rowIndex, @userIdPosition) AS requestedOffset
FROM Leaderboard
ORDER BY allTimePoints DESC) AS sortedL
SET L.allTimeRank = sortedL.rank
WHERE sortedL.userId = L.userId;
SET SQL_SAFE_UPDATES=1;
Question (2): how do I make the WHERE condition work?
This has taken anywhere between 12 sec and 2 min to run. I'm not sure why it is so inconsistent. In any case, this will block UPDATEs from users that are winning points, giving the sense that the app has hung. Question (3): is there a workaround?
First thing, you are not computing Rank correctly. If there are three players: Britney(100 pts), Rachel(100 pts), and Susan(75 pts), then Britney and Rachel each have a rank of 1, and Susan should have a rank of 3. Your routine would give Susan a rank of 2.
Second, when players have the same score (and rank) they should display in a consistent order. The order within tied scores/ranks should be the order in which she attained that score.
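To illustrate both points, here is a small Python sketch of that ranking rule (tied players share a rank, and the next rank skips ahead). The function name competition_rank is mine, and it assumes the input list is already in the order in which players attained their scores, so that the stable sort keeps ties in that order.

```python
def competition_rank(scores):
    """scores: list of (name, points), assumed to be in the order the
    scores were attained. Returns (name, points, rank) tuples."""
    # sorted() is stable, so players with equal points keep their input order.
    ordered = sorted(scores, key=lambda s: -s[1])
    ranked = []
    for position, (name, points) in enumerate(ordered, start=1):
        if ranked and ranked[-1][1] == points:
            rank = ranked[-1][2]   # tie: share the previous player's rank
        else:
            rank = position        # otherwise rank equals the row position
        ranked.append((name, points, rank))
    return ranked


print(competition_rank([("Britney", 100), ("Rachel", 100), ("Susan", 75)]))
```

Here Britney and Rachel both get rank 1 and Susan gets rank 3, not 2.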
I would add two columns to the table: allTimeRank, and allTimeRankOrder. And update in real time every time the points change. Realize that if my score goes from 100 to 125, the only users that need to be reranked are those that had scores from 100-124 -- just the people I jumped over.
Here is a routine to do it. It assumes points always go up, never down. I don't have a million row table to test with, but if you have the right indexes set up I hope it will run pretty fast.
CREATE PROCEDURE `updateUserPoints`(IN `puserid` VARCHAR(10), IN `pnewPoints` INT)
BEGIN
    SET @currPoints = 0;
    SET @currRank = 0;
    SET @currRankOrder = 0;
    SELECT allTimePoints, allTimeRank, allTimeRankOrder INTO @currPoints, @currRank, @currRankOrder from Leaderboard where userid = puserid;
    SET @newRank = 0;
    SET @newRankOrder = 0;
    SELECT max(allTimeRank), max(allTimeRankOrder)+1 INTO @newRank, @newRankOrder FROM Leaderboard WHERE allTimePoints = pnewPoints;
    IF (@newRank IS NULL) THEN
        SET @newRank = (SELECT min(allTimeRank) from Leaderboard WHERE allTimePoints < pnewPoints);
        SET @newRankOrder = 0;
    END IF;
    UPDATE Leaderboard
    SET allTimePoints = pnewPoints,
        allTimeRank = @newRank,
        allTimeRankOrder = @newRankOrder
    WHERE userid = puserid;
    /* all the people that I was tied with, but ahead in order,
       slide up one in the order */
    UPDATE Leaderboard
    SET allTimeRankOrder = allTimeRankOrder - 1
    WHERE allTimeRank = @currRank
    AND allTimeRankOrder > @currRankOrder;
    /* did I jump anyone? Their rank goes down. */
    UPDATE Leaderboard
    SET allTimeRank = allTimeRank + 1
    WHERE userid <> puserid
    AND allTimePoints >= @currPoints
    AND allTimePoints < pnewPoints;
END
table
create table tst(locationId int,
scheduleCount tinyint(1) DEFAULT 0,
displayFlag tinyint(1) DEFAULT 0);
INSERT INTO tst(locationId,scheduleCount)
values(5,0),(2,0),(5,1),(5,2),(2,1),(2,2);
I update multiple rows and multiple columns with one query, but I want to change one of the columns only for the first row and leave that column unchanged for the other rows.
I want to update all the rows with a given location id, setting displayFlag to 1, and increment scheduleCount by 1 for only the top entry; the rest would remain the same.
**Query **
update tst,(select @rownum:=0) r,
set tst.displayFlag =1,
scheduleCount = (CASE WHEN @rownum=0
then scheduleCount+1
ELSE scheduleCount
END),
@rownum:=1 where locationId = 5
But it gives an error and does not set the user-defined variable rownum. I am able to join the tables in a SELECT and change the value of rownum; is there any other way to update the values?
I'm not sure this is the correct way of doing such a thing, but it is possible to include the user variable logic in the CASE condition:
UPDATE tst
JOIN (SELECT @first_row := 1) r
SET tst.displayFlag = 1,
    scheduleCount = CASE
        WHEN @first_row = 1 AND ((@first_row := 0) OR TRUE) THEN scheduleCount+1
        ELSE scheduleCount
    END
WHERE locationId = 5;
I have used a @first_row flag as this is more in line with your initial attempt.
The CASE works as follows:
On the first row @first_row = 1, so the second part of the WHEN after AND is processed, setting @first_row := 0. Unfortunately for us, the assignment returns 0, hence the OR TRUE to ensure the condition as a whole is TRUE. Thus scheduleCount + 1 is used.
On the second row @first_row != 1, so the condition is FALSE, the second part of the WHEN after AND is not processed and the ELSE scheduleCount is used.
You can see it working in this SQL Fiddle. Note: I had to set the column types to TINYINT(3) to get the correct results.
N.B. Without an ORDER BY there is no guarantee as to what the '1st' row will be; not even that it will be the 1st as returned by a SELECT * FROM tst.
UPDATE
Unfortunately, one cannot add an ORDER BY if there is a join, so you have a choice:
Initialise @first_row outside the query and remove the JOIN.
Otherwise you are probably better off rewriting the query to something similar to:
UPDATE tst
JOIN (
SELECT locationId,
scheduleCount,
displayFlag,
@row_number := @row_number + 1 AS row_number
FROM tst
JOIN (SELECT @row_number := 0) init
WHERE locationId = 5
ORDER BY scheduleCount DESC
) tst2
ON tst2.locationId = tst.locationId
AND tst2.scheduleCount = tst.scheduleCount
AND tst2.displayFlag = tst.displayFlag
SET tst.displayFlag = 1,
tst.scheduleCount = CASE
WHEN tst2.row_number = 1 THEN tst.scheduleCount+1
ELSE tst.scheduleCount
END;
Or write two queries:
UPDATE tst
SET displayFlag = 1
WHERE locationId = 5;
UPDATE tst
SET scheduleCount = scheduleCount + 1
WHERE locationId = 5
ORDER BY scheduleCount DESC
LIMIT 1;
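To spell out what the two statements are meant to achieve, here is a plain Python model of the same two steps on an in-memory list. The function name update_location and the dict-based row layout are assumptions for illustration; "top" is taken to mean the row with the highest scheduleCount, matching the ORDER BY scheduleCount DESC above.

```python
def update_location(rows, location_id):
    """rows: list of dicts with locationId, scheduleCount, displayFlag keys.

    Step 1: set displayFlag = 1 on every row for the location.
    Step 2: increment scheduleCount on only the top row for that location.
    The rows are modified in place and also returned.
    """
    matching = [r for r in rows if r["locationId"] == location_id]
    for r in matching:
        r["displayFlag"] = 1
    if matching:
        top = max(matching, key=lambda r: r["scheduleCount"])
        top["scheduleCount"] += 1
    return rows


rows = [
    {"locationId": loc, "scheduleCount": cnt, "displayFlag": 0}
    for loc, cnt in [(5, 0), (2, 0), (5, 1), (5, 2), (2, 1), (2, 2)]
]
update_location(rows, 5)
print(rows)
```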
I would like to round up a value to the next nearest power of 2 in a mysql query, so
select RoundUpToNearestPowerOfTwo(700) -- Should give 1024
I need this solution as part of a much larger query to generate and read some bitmask. Using custom stored functions is not an option, since I cannot use those in our production environment, so I'm looking for a smart way to do this inline in the query.
[Edit]
One possible way I can think of, is creating some enumerator, use a power in that, and choose the smallest value larger than my value:
select
min(BOUND)
from
(select 700 as VALUE) v
inner join
(select
POW(2, @pow := @pow + 1) as BOUND
from
(select @pow := 0) x,
MY_RANDOM_TABLE t
) x on x.BOUND > v.VALUE
But as you can tell, it's pretty verbose, so a leaner solution would be welcome.
Try this.
FLOOR(POW(2,CEIL(LOG2(1025))))
The CEIL and FLOOR cope with the boundary conditions correctly.
Try this:
select power(2, 1 + floor(log2(XXX)))
MySQL conveniently has the log2() function, which does most of the work.
EDIT:
I think this may be what you want:
select (case when floor(log2(XXX)) <> log2(XXX)
then power(2, 1 + floor(log2(XXX)))
else power(2, floor(log2(XXX)))
end)
Or something like:
select power(2, 1 + floor(log2(XXX*0.999999)))
There is a boundary condition on actual powers of 2.
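If you want to sanity-check expected values outside the database, the boundary behaviour can be sketched in Python. Note that this uses integer bit length rather than log2, so it is a different technique from the SQL above, and the function name round_up_to_power_of_two is just an illustrative choice; it assumes a positive integer input.

```python
def round_up_to_power_of_two(n):
    """Smallest power of two that is >= n, for a positive integer n.

    Exact powers of two map to themselves, which is the boundary case
    the SQL expressions above have to handle explicitly.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    # (n - 1).bit_length() is the number of bits needed to represent n - 1.
    return 1 << (n - 1).bit_length()


for value in (700, 1024, 1025):
    print(value, round_up_to_power_of_two(value))
```

Working with integers avoids any floating-point rounding in log2 near exact powers of two.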
If you are using SQL Server then you can try this. Just change the value of the variable @value to get the next nearest power of 2 for any value:
declare @count int = 1
declare @value int = 700
while (@value <> 1)
BEGIN
    set @value = @value / 2
    set @count = @count + 1
END
select power(2, @count)
I want to select currentrow as part of my query - I know I can loop over queries and get the currentrow variable, but I'm doing a QoQ before I use the rows and I want to keep the original rows, e.g.
//Original query
1, Audi
2, BMW
3, Skoda
//QoQ
1, Audi
3, Skoda
This is the code I've got:
q = new Query( datasource = application.db.comcar );
q.setSQL('
SELECT make, @rownum := @rownum +1 AS `rownumber`
FROM cars, ( SELECT @rownum :=0 )
LIMIT 10
');
r = q.execute().getResult();
But it's throwing the following error:
Parameter '=' not found in the list of parameters specified
SQL: SELECT make, @rownum := @rownum + 1 AS `rownumber` FROM cars, ( SELECT @rownum :=0 ) LIMIT 10
This will work in cfquery but I'd like to use it in CFScript. Is there an alternative to using := or some way of escaping this in the query.
It looks like this is a bug in ColdFusion. I could change my code to use cfquery, but I'd rather not mix script and tags in my page.
So my workaround is as follows:
/*
* based on the existing query 'tmpFields'
*/
// build array of row numbers
arrRowNumbers = [];
cntRowNumbers = tmpFields.recordCount;
for( r = 1; r <= cntRowNumbers; r++ ) {
arrayAppend( arrRowNumbers, r );
}
// add a new column with the new row number array
queryAddColumn( tmpFields, "fieldNumber", "integer", arrRowNumbers );