MySQL query help - 'count' as variable not working as expected - mysql

I have the following (fairly complex) query:
SELECT
@idx :=
CASE
WHEN @prev_paper = paper_id
THEN @idx +1
ELSE 1
END AS idx,
@prev_paper := t1.paper_id AS paper_id,
@cnt := (SELECT COUNT(DISTINCT(organization)) as dcnt from authors A INNER JOIN authors__papers AP on AP.author_id = A.author_id where AP.is_contact_author < 1 AND paper_id = @prev_paper GROUP BY paper_id) as org_count,
IF(@cnt > 1, GROUP_CONCAT('{', @idx, '}', first_name, last_name), GROUP_CONCAT(first_name, last_name)) AS names
FROM (
SELECT
AP.paper_id as paper_id, A.organization, A.first_name, A.last_name, A.country
FROM authors__papers AP
INNER JOIN authors A ON A.author_id = AP.author_id
WHERE AP.is_contact_author <1
) AS t1, (
SELECT @prev_paper := '', @idx :=0
) AS t2
) AS t2
GROUP BY paper_id, organization
ORDER BY paper_id, organization
And it outputs results as follows:
idx paper_id org_count names
1 5002 2 MarioIannazzo,EduardAlarcon
2 5002 2 {2}VikramPassi,{2}HimadriPandey,{2}MaxLemme
1 5003 1 {1}JiaSun
1 5004 1 Juan A.Leñero-Bardallo,AngelRodríguez-Vázquez,RicardoCarmona-Galán
1 5005 3 AlexandreVernhet,JeanCoignus
2 5005 3 {2}GerardGhibaudo
3 5005 3 {3}Jean-LucOgier,{3}GiulioTorrente,{3}DavidRoy
1 5006 1 {1}JerodMason,{1}PaulDicarlo,{1}HanchingFuh,{1}DavidWhitefield,{1}FlorinelBalteanu
1 5007 3 SivkhengKor,DavidSchwartz,JanosVeres,PingMei
2 5007 3 {2}ChristerKarlsson,{2}PerBroms
3 5007 3 {3}Tse NgaNg
...
As you can see, 'org_count' (@cnt) is not working as expected. The '{idx}' prefix (@idx) is sometimes missing from the names when org_count is > 1 (like the first row of 5002), and sometimes present when org_count is = 1 (like 5003, 5006 ...). The output should look like this:
idx paper_id org_count names
1 5002 2 {1}MarioIannazzo,{1}EduardAlarcon
2 5002 2 {2}VikramPassi,{2}HimadriPandey,{2}MaxLemme
1 5003 1 JiaSun
1 5004 1 Juan A.Leñero-Bardallo,AngelRodríguez-Vázquez,RicardoCarmona-Galán
1 5005 3 {1}AlexandreVernhet,{1}JeanCoignus
2 5005 3 {2}GerardGhibaudo
3 5005 3 {3}Jean-LucOgier,{3}GiulioTorrente,{3}DavidRoy
1 5006 1 JerodMason,PaulDicarlo,HanchingFuh,DavidWhitefield,FlorinelBalteanu
1 5007 3 {1}SivkhengKor,{1}DavidSchwartz,{1}JanosVeres,{1}PingMei
2 5007 3 {2}ChristerKarlsson,{2}PerBroms
3 5007 3 {3}Tse NgaNg
...
It's like something seems off by 1, but I cannot for the life of me figure out what or why. Any help is appreciated!

Marking as answered. @Drew's comment links to required reading that will hopefully lead to the solution.

I can't really tell what you want the query to do, but I think I know the problem. MySQL does not guarantee the order of evaluation of expressions in a select clause. Your variables are assigned in some expressions and then used in others, but the order in which those expressions are evaluated is not defined.
My difficulty in understanding the query comes from things like this: the first column of the query is @prev_paper, but the first column of the results is labeled id.
The trick to using variables correctly is to put all the logic in a single expression.
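If you are on MySQL 8.0 or later, an alternative is to avoid user variables altogether and use window functions, so that evaluation order stops mattering. The following is only a sketch against the tables from the question. It assumes organization is never NULL, and it derives org_count from two DENSE_RANK() calls because MySQL does not allow COUNT(DISTINCT ...) as a window function.

```sql
-- Sketch only, assuming MySQL 8.0+ and a non-NULL organization column.
SELECT idx,
       paper_id,
       org_count,
       GROUP_CONCAT(
         IF(org_count > 1,
            CONCAT('{', idx, '}', first_name, last_name),
            CONCAT(first_name, last_name))
       ) AS names
FROM (
  SELECT AP.paper_id,
         A.first_name,
         A.last_name,
         -- position of the organization within its paper
         DENSE_RANK() OVER (PARTITION BY AP.paper_id ORDER BY A.organization) AS idx,
         -- ascending rank + descending rank - 1 = number of distinct organizations
         DENSE_RANK() OVER (PARTITION BY AP.paper_id ORDER BY A.organization)
           + DENSE_RANK() OVER (PARTITION BY AP.paper_id ORDER BY A.organization DESC)
           - 1 AS org_count
  FROM authors__papers AP
  INNER JOIN authors A ON A.author_id = AP.author_id
  WHERE AP.is_contact_author < 1
) AS t
GROUP BY paper_id, idx, org_count
ORDER BY paper_id, idx;
```

Because idx and org_count are computed per row in the derived table and only aggregated afterwards, the IF() inside GROUP_CONCAT() always sees values that belong to the current row.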

Related

MySQL Create Ordinal

I'm trying to create a temp table or view that I can use for charting league members changing handicaps over the season. This requires creating a pivot table.
Even before that, however, I need to create an ordinal column to represent the nth match the person played that season. I can't use date_played since players can play on all different dates, and I'd like each player's 2nd match to line up vertically, and their third to do so as well, etc.
I used code from the answer to this question which seemed like it should work. Here's a copy of the relevant section:
SELECT books.*,
if( @libId = libraryId,
@var_record := @var_record + 1,
if(@var_record := 1 and @libId := libraryId, @var_record, @var_record)
) AS Ordinal
FROM books
JOIN (SELECT @var_record := 0, @libId := 0) tmp
ORDER BY libraryId;
`lwljhb_lwl_matches` # One record for each match played
id
date_played
other fields about the match
id date_played
1 2017-08-23
2 2017-08-29
3 2017-09-26
4 2017-08-24
5 2017-09-02
6 2017-09-21
7 2017-08-24
8 2017-08-31
9 2017-09-05
10 2017-09-15
11 2017-09-17
`lwljhb_users`
id
display_name
id display_name
1 Alan
2 Bill
3 Dave
`lwljhb_lwl_players` # One record per player per match
id
match_id # Foreign key to matches.id
player_id # Foreign key to users.id
end_hcp
other fields about the player's performance in the linked match
id match_id player_id end_hcp
1 1 1 720
2 2 1 692
3 3 1 694
4 4 2 865
5 5 2 868
6 6 2 842
7 7 3 363
8 8 3 339
9 9 3 332
10 10 3 348
11 11 3 374
Normally there would be two records in PLAYERS for each MATCH record, but I didn't add them here. The fact that the id and match_id are the same in every record of PLAYERS is artificial because this isn't real data.
Before I used this snippet, my code looked like this:
SELECT u.display_name,
m.date_played,
p.end_hcp
FROM `lwljhb_lwl_matches` AS m
INNER JOIN `lwljhb_lwl_players` AS p
INNER JOIN `lwljhb_users` AS u
ON m.id = p.match_id AND
p.player_id = u.id
WHERE league_seasons_id = 12 AND
playoff_round = 0
ORDER BY u.display_name, m.date_played
and generated data that looks like this:
display_name date_played end_hcp
Alan 2017-08-23 720
Alan 2017-08-29 692
Alan 2017-09-26 694
Bill 2017-08-24 865
Bill 2017-09-02 868
Bill 2017-09-21 842
Dave 2017-08-24 363
Dave 2017-08-31 339
Dave 2017-09-05 332
Dave 2017-09-15 348
Dave 2017-09-17 374
At any time during the season there will be different numbers of matches played by each of the players, but by the end of the season, they will have all played the same number of matches.
I tried to incorporate Alin Stoian's code into mine like so, changing a variable name and field names as I thought appropriate.
SET @var_record = 1;
SELECT u.display_name,
m.date_played,
p.end_hcp,
if( @player = u.display_name,
@var_record := @var_record + 1,
if(@var_record := 1 and @player := u.display_name, @var_record, @var_record)
) AS Ordinal
FROM `lwljhb_lwl_matches` AS m
INNER JOIN `lwljhb_lwl_players` AS p
INNER JOIN `lwljhb_users` AS u
JOIN (SELECT @var_record := 0, @player := 0) tmp
ON m.id = p.match_id AND
p.player_id = u.id
WHERE league_seasons_id = 12 AND
playoff_round = 0
ORDER BY u.display_name, m.date_played
I was hoping for a new column with the ordinals, but the new column is all zeros. Before I move on to trying the pivot table, I have to get these ordinals, so I hope someone here can show me my mistake.
Try this:
SELECT u.display_name,
m.date_played,
p.end_hcp,
@var_record := if ( @player = u.display_name,
@var_record + 1,
if(@player := u.display_name, 1, 1)
) AS Ordinal
FROM `lwljhb_lwl_matches` AS m
INNER JOIN `lwljhb_lwl_players` AS p
INNER JOIN `lwljhb_users` AS u
JOIN (SELECT @var_record := 0, @player := 0) tmp
ON m.id = p.match_id AND
p.player_id = u.id
WHERE league_seasons_id = 12 AND
playoff_round = 0
ORDER BY u.display_name, m.date_played
Demo
OK, it turns out that I wasn't getting all 1s in my Ordinal column after all, just in the 1st 25 records. Since it was obviously wrong I didn't look further.
The sample data I provided is in physical order as well as logical order, but MY ACTUAL data is not. That was the only difference I could think of, so I investigated further and found that in the 1st 1,000 records of output I got 7 records with 2 as the ordinal, not coincidentally where the m.ids were consecutive.
I created a view to put the data in order like in the sample and then JOINed that to the rest of the tables, but still got bad data for the Ordinal.
When I swapped out the view for a temporary table it worked. It turns out that an ORDER BY in a view is ignored if the query that selects from the view has its own ORDER BY.
Bill Karwin pointed this out to me in a different question (46892912).
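For anyone on MySQL 8.0 or later, here is a sketch of the same ordinal without user variables, which does not depend on the physical order of the rows at all. It reuses the table and column names from the question. The m.id tiebreaker for two matches on the same date is my assumption.

```sql
-- Sketch only, assuming MySQL 8.0+ (window functions).
SELECT u.display_name,
       m.date_played,
       p.end_hcp,
       -- nth match of this player, counted in date order
       ROW_NUMBER() OVER (PARTITION BY u.id
                          ORDER BY m.date_played, m.id) AS Ordinal
FROM `lwljhb_lwl_matches` AS m
INNER JOIN `lwljhb_lwl_players` AS p ON m.id = p.match_id
INNER JOIN `lwljhb_users` AS u ON p.player_id = u.id
WHERE league_seasons_id = 12 AND
      playoff_round = 0
ORDER BY u.display_name, m.date_played;
```

The numbering is defined by the OVER clause rather than by the order in which rows happen to be read, so no view or temporary table is needed to pre-sort the data.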

MySQL Winning Streak for every Player

I have a table with winner and loser statistics from a game:
id winner_id loser_id
1 1 2
2 1 2
3 3 4
4 4 3
5 1 2
6 2 1
7 3 4
8 3 2
9 3 5
10 3 6
11 2 3
12 3 6
13 2 3
I want a result table showing the highest winning streak of every player in the game. A player's streak is broken when he loses a game (player_id = loser_id). It should look like:
player_id win_streak
1 3
2 2
3 4
4 1
5 0
6 0
I tried many queries with user-defined variables etc., but I can't find a solution. Thanks!
SQL Fiddle : http://sqlfiddle.com/#!9/3da5f/1
Is this the same as Alex's approach; I'm not quite sure, except that it seems to have one distinct advantage.... ;-)
SELECT player_id, MAX(CASE WHEN result = 'winner' THEN running ELSE 0 END) streak
FROM
( SELECT *
, IF(player_id = @prev_player,IF(result=@prev_result,@i:=@i+1,@i:=1),@i:=1) running
, @prev_result := result
, @prev_player:=player_id
FROM
( SELECT id, 'winner' result, winner_id player_id FROM my_table
UNION
SELECT id, 'loser', loser_id FROM my_table
) x
,
( SELECT @i:=1,@prev_result := '',@prev_player:='' ) vars
ORDER
BY x.player_id
, x.id
) a
GROUP
BY player_id;
I guess you would be better off doing this on the PHP side (or in whatever other language you use).
But just to give you an idea, as an experiment and an example for some special cases (I hope it is useful somewhere),
here is my approach:
http://sqlfiddle.com/#!9/57cc65/1
SELECT r.winner_id,
(SELECT MAX(IF(winner_id=r.winner_id,IF(@i IS NULL, @i:=1,@i:=@i+1), IF(loser_id = r.winner_id, @i:=0,0)))
FROM Results r1
WHERE r1.winner_id = r.winner_id
OR r1.loser_id = r.winner_id
GROUP BY IF(winner_id=r.winner_id, winner_id,loser_id)) win_streak
FROM ( SELECT winner_id
FROM Results
GROUP BY winner_id
) r
For now it does not return all ids, only those of players who have won at least once. To improve on that, you probably have a user table, and if so it would simplify the query. If you have no user table, you need to UNION in the players who have never won.
Feel free to ask if you have any questions.
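On MySQL 8.0 or later the streaks can also be computed without user variables. The sketch below assumes the table is called my_table, as in the first answer, and that id reflects the order in which games were played. It counts each player's losses so far, so every run of games between two losses shares one group number, and then sums the wins inside each group.

```sql
-- Sketch only, assuming MySQL 8.0+ (CTEs and window functions).
WITH games AS (
  -- one row per player per game: won = 1 for the winner, 0 for the loser
  SELECT id, winner_id AS player_id, 1 AS won FROM my_table
  UNION ALL
  SELECT id, loser_id, 0 FROM my_table
),
marked AS (
  -- running count of losses; it only changes when a streak is broken
  SELECT player_id, won,
         SUM(1 - won) OVER (PARTITION BY player_id ORDER BY id) AS losses_so_far
  FROM games
),
streaks AS (
  -- wins between two consecutive losses form one streak
  SELECT player_id, losses_so_far, SUM(won) AS streak
  FROM marked
  GROUP BY player_id, losses_so_far
)
SELECT player_id, MAX(streak) AS win_streak
FROM streaks
GROUP BY player_id
ORDER BY player_id;
```

Players who never won still appear, with a win_streak of 0, because the games CTE includes the loser side of every game.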

select last three commenter for each post (Greatest n per group)

MySQL
SELECT DISTINCT comments.commenter_id FROM comments WHERE ((oid IN (421,425)
AND otype = 'post') OR (oid IN (331) AND otype = 'photo')) ORDER BY
post_id,type,comment_id LIMIT 3
What I want to do is select the last three distinct commenters for each individual post or photo with the respective ids,
i.e. at most 3 commenters for each oid and otype combination.
But instead of yielding the last three distinct commenters per group, the query above yields three in total.
Where am I going wrong? Can anyone help me out?
IF LIMIT IS 2
ID | oid | otype | commenter_id
1 1 post 1
2 1 post 1
3 1 post 2
4 1 post 3
5 2 post 1
6 1 photo 2
7 2 post 3
OUTPUT SHOULD BE
commenter_id| o_type | o_id
3 post 1
2 post 1
3 post 2
1 post 2
2 photo 1
SOLVED - GREATEST N PER GROUP
That was easy though! :P
This question helped me a lot.
SELECT DISTINCT t.commenter_id,t.oid,t.otype
from
(
SELECT c.*,
@row_number:=if(@oid = oid, @row_number + 1, 1) AS row_number,
@oid:=oid AS varval
FROM comments c
join (select @row_number := 0, @oid:= NULL) as var
ON
((oid IN (425) AND otype = 'post') OR (oid IN (331) AND otype = 'photo'))
order by comment_id DESC
) t
where t.row_number <=2
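For later readers on MySQL 8.0 or newer, here is a sketch of the same greatest-n-per-group idea with ROW_NUMBER() instead of variables. It assumes the columns are named comment_id, oid, otype and commenter_id as in the queries above. It first reduces each commenter to their latest comment per (oid, otype), so the limit applies to distinct commenters.

```sql
-- Sketch only, assuming MySQL 8.0+ (window functions).
SELECT commenter_id, otype, oid
FROM (
  SELECT latest.*,
         -- 1 = most recent commenter within each (oid, otype) group
         ROW_NUMBER() OVER (PARTITION BY oid, otype
                            ORDER BY last_comment_id DESC) AS rn
  FROM (
    -- one row per commenter per (oid, otype), keeping their newest comment
    SELECT oid, otype, commenter_id, MAX(comment_id) AS last_comment_id
    FROM comments
    WHERE (oid IN (421, 425) AND otype = 'post')
       OR (oid IN (331) AND otype = 'photo')
    GROUP BY oid, otype, commenter_id
  ) AS latest
) AS ranked
WHERE rn <= 3
ORDER BY otype, oid, rn;
```

Change the rn <= 3 filter to whatever limit you need per group.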

Mysql JOIN with extra priority column

I have spent two days trying to write this query with no luck.
I have two tables, 'DEMAND' and 'DEMAND_STATE' (a one-to-many relation). The table DEMAND_STATE has millions of entries.
CREATE TABLE DEMAND
(
ID INT NOT NULL,
DESTINY_ID INT NOT NULL
)
CREATE TABLE DEMAND_STATE
(
ID INT NOT NULL,
PRIORITY INT NOT NULL,
QUANTITY DOUBLE NOT NULL,
CASE_ID INT NOT NULL,
DEMAND_ID INT NOT NULL,
PHASE_ID INT NOT NULL
)
The QUANTITY of a DEMAND_STATE row is given for a CASE_ID and a PHASE_ID. We have 'N' Phases in 'M' Cases, always with the same number of Phases in every Case. There is always an initial base quantity, called the 'BASE CASE', in the Case with CASE_ID = 1.
For example, to obtain the quantities for Case (id=2) and the Base Case (id=1):
select D.*, S.PRIORITY, S.QUANTITY, S.CASE_ID, S.DEMAND_ID, S.PHASE_ID
FROM DEMAND D
join DEMAND_STATE S on (D.ID = S.DEMAND_ID)
WHERE (S.CASE_ID = 2 OR S.CASE_ID = 1)
(paste only for id=8)
ID PRIORITY QUANTITY CASE_ID DEMAND_ID PHASE_ID
8 0 85 1 8 1
8 0 83 1 8 2
8 0 88 1 8 3
8 0 89 1 8 4
8 10 85 2 8 1
8 10 84 2 8 2
8 10 86 2 8 3
8 10 89 2 8 4
For every Demand in 'DEMAND', we need only the Quantity with MAX priority for each Phase. The idea is not to duplicate DEMAND_STATE data each time a new Case is created: new state rows are created only when a Demand-Case-Phase differs from the Base Case. This is a new project and we accept changes to the model for better performance.
I also tried a MAX calculation. This query over DEMAND_STATE works fine but only obtains data for one specific DEMAND_ID. Furthermore, I think this solution could be very expensive.
SELECT P.ID, P.QUANTITY, P.CASE_ID, P.DEMAND_ID, P.PHASE_ID
FROM DEMAND_STATE P
JOIN (
SELECT PHASE_ID, MAX(PRIORITY) max_priority, S.DEMAND_ID
from DEMAND_STATE S
WHERE S.DEMAND_ID = 1
AND (S.CASE_ID=1 OR S.CASE_ID=2)
GROUP BY S.PHASE_ID
) SUB
ON (SUB.PHASE_ID = P.PHASE_ID AND SUB.max_priority = P.PRIORITY)
WHERE P.DEMAND_ID = 1
GROUP BY P.PHASE_ID
The result:
ID QUANTITY CASE_ID DEMAND_ID PHASE_ID
1 86 1 1 1
2 85 1 1 2
3 81 1 1 3
8 500 2 1 4
This is the result expected:
ID ID PRIORITY QUANTITY CASE_ID PHASE_ID
8 1 0 86 1 1 (data from Case Base id=1 priority 0)
8 2 10 85 1 2 (data from Case Base id=1 priority 0)
8 3 10 81 1 3 (data from Case Base id=1 priority 0)
8 64 10 500 2 4 (data from Case id=2 priority 10)
thank for help :)
Edit:
Result of Simon proposal:
ID QUANTITY CASE_ID DEMAND_ID PHASE_ID
1 86 1 1 1
2 85 1 1 2
3 81 1 1 3
4 84 1 1 4 (this row shouldn't exist)
8 500 2 1 4 (this is the correct row)
It would also have to be joined with DEMAND.
@didierc's response:
ID ID MAX(S.PRIORITY) QUANTITY CASE_ID PHASE_ID
1 8 10 500 2 4
2 13 10 81 2 1
2 14 10 83 2 2
2 15 10 84 2 3
3 21 10 81 2 1
4 31 10 86 2 3
4 32 10 80 2 4
4 29 10 85 2 1
4 30 10 81 2 2
We need four rows with the quantity value for each DEMAND. In the Base Case we have four quantities, and in Case 2 we only change the quantity for phase 4. We always need four rows for each demand.
Database DEMAND_STATE data:
ID PRIORITY QUANTITY CASE_ID DEMAND_ID PHASE_ID
1 0 86 1 1 1
2 0 85 1 1 2
3 0 81 1 1 3
4 0 84 1 1 4
8 10 500 2 1 4
We need to obtain for all Demand in 'DEMAND' only the Quantity for Each Phase with MAX priority
I translate the above, according to your sample result set, as:
SELECT
D.ID, S.ID, MAX(S.PRIORITY), S.QUANTITY, S.CASE_ID, S.PHASE_ID
FROM DEMAND D
LEFT JOIN DEMAND_STATE S
ON D.ID = S.DEMAND_ID
GROUP BY S.PHASE_ID, S.DEMAND_ID
Update:
To get the maximum priority for each pair (demand_id, phase_id), we use the following query:
SELECT
DEMAND_ID, PHASE_ID, MAX(PRIORITY) AS PRIORITY
FROM DEMAND_STATE
GROUP BY DEMAND_ID, PHASE_ID
Next, to retrieve the set of phases for a given demand, just make an inner join on demand state:
SELECT S.* FROM DEMAND_STATE S
INNER JOIN (
SELECT
DEMAND_ID, PHASE_ID, MAX(PRIORITY) AS PRIORITY
FROM DEMAND_STATE
GROUP BY DEMAND_ID, PHASE_ID
) S2
USING (DEMAND_ID,PHASE_ID, PRIORITY)
WHERE DEMAND_ID = 1
If you want to limit the possible cases, include a where clause in the query S2:
SELECT S.* FROM DEMAND_STATE S
INNER JOIN (
SELECT
DEMAND_ID, PHASE_ID, MAX(PRIORITY) AS PRIORITY
FROM DEMAND_STATE
WHERE CASE_ID IN (1,2)
GROUP BY DEMAND_ID, PHASE_ID
) S2
USING (DEMAND_ID,PHASE_ID, PRIORITY)
WHERE DEMAND_ID = 1
However, your comments and update indicate that MAX(PRIORITY) does not seem very relevant after all. My understanding is that you have a base case, which may be overridden by another case in a given scenario (that scenario is the pair base case + some other case). Clarify that point in your question body if this is incorrect. If that is the case, you may change the above query by replacing PRIORITY with CASE_ID:
SELECT S.* FROM DEMAND_STATE S
INNER JOIN (
SELECT
DEMAND_ID, PHASE_ID, MAX(CASE_ID) AS CASE_ID
FROM DEMAND_STATE
WHERE CASE_ID IN (1,2)
GROUP BY DEMAND_ID, PHASE_ID
) S2
USING (DEMAND_ID,PHASE_ID, CASE_ID)
WHERE DEMAND_ID = 1
The only reason I see for having a priority is if you wish to combine more than 2 cases, and use priority to select which case will prevail depending on the phase.
You may of course prepend an inner join on DEMAND to include the related demand data.
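As a sketch of that last step, the MAX(PRIORITY) variant could be joined to DEMAND like this for all demands at once. The STATE_ID alias is my own naming, and the outer CASE_ID filter is there so that a row from some other case with the same priority cannot slip in.

```sql
-- Sketch only: one row per demand and phase, taken from the
-- highest-priority state among cases 1 and 2.
SELECT D.ID,
       D.DESTINY_ID,
       S.ID AS STATE_ID,   -- hypothetical alias, to avoid two columns named ID
       S.PRIORITY,
       S.QUANTITY,
       S.CASE_ID,
       S.PHASE_ID
FROM DEMAND D
INNER JOIN DEMAND_STATE S
        ON S.DEMAND_ID = D.ID
INNER JOIN (
    SELECT DEMAND_ID, PHASE_ID, MAX(PRIORITY) AS PRIORITY
    FROM DEMAND_STATE
    WHERE CASE_ID IN (1,2)
    GROUP BY DEMAND_ID, PHASE_ID
) S2
        ON S2.DEMAND_ID = S.DEMAND_ID
       AND S2.PHASE_ID  = S.PHASE_ID
       AND S2.PRIORITY  = S.PRIORITY
WHERE S.CASE_ID IN (1,2)
ORDER BY D.ID, S.PHASE_ID;
```

With millions of rows in DEMAND_STATE, an index covering (DEMAND_ID, PHASE_ID, PRIORITY) would probably help the grouped subquery, but that is something to measure on your data.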
Use of subqueries should be able to do as you wish, if I understand your question correctly. Something along the lines of the following:
SELECT
P.ID,
P.QUANTITY,
P.CASE_ID,
P.DEMAND_ID,
P.PHASE_ID
FROM DEMAND_STATE P
INNER JOIN (
-- Next level up groups it down and so gets the rows first returned for each PHASE_ID, which is the highest priority due to the subquery
SELECT
D.PHASE_ID,
D.PRIORITY,
D.DEMAND_ID
FROM (
-- Top level query to get all rows and order them in desc priority order
SELECT
S.PHASE_ID,
S.PRIORITY,
S.DEMAND_ID
FROM DEMAND_STATE S
WHERE S.DEMAND_ID IN (1) -- Update this to be whichever DEMAND_IDs you are interested in
AND S.CASE_ID IN (1,2)
ORDER BY
S.PHASE_ID ASC,
S.DEMAND_ID ASC,
S.PRIORITY DESC
) D
GROUP BY
D.PHASE_ID,
D.DEMAND_ID
) SUB
ON SUB.PHASE_ID = P.PHASE_ID
AND SUB.DEMAND_ID = P.DEMAND_ID
The top level subquery exists to get the rows you are interested in and order them in an order which allows predictable results when they are then grouped down by PHASE_ID and DEMAND_ID. This in turn allows a simple INNER JOIN to DEMAND_STATE hopefully (unless I have misunderstood your query)
This may still be expensive though depending on how much data is within that top level query.

How to add rank, based on points in mysql

I am struggling with a MySQL query. Please help me.
This is my query. I am getting the correct result, but I need to modify the result in MySQL.
SELECT bu.username,
bg.id as goal_id,
br.id as reason_id,
(SELECT COUNT(test_reason_id) FROM test_rank WHERE test_reason_id = br.id) as point
FROM
test_goal AS bg INNER JOIN test_reason AS br ON
br.user_id=bg.user_id INNER JOIN test_user AS bu ON
br.user_id=bu.id
WHERE
bg.id = br.test_goal_id
GROUP BY
bg.id
ORDER BY
point DESC
Table-1
My actual table looks like this; when I use ORDER BY point DESC it looks like Table-2.
username goal_id reason_id point
khan 8 3 2
john 6 9 5
yoyo 5 21 4
smith 11 6 5
Table-2
My result set looks like this:
username goal_id reason_id point
john 6 9 5
smith 11 6 5
yoyo 5 21 4
khan 8 3 2
But I want my result set to look like this:
username goal_id reason_id point rank
john 6 9 5 1
smith 11 6 5 2
yoyo 5 21 4 3
khan 8 3 2 4
Is this possible? Can anyone please help me? It is too difficult for me.
Add a row count variable like this:
select a.*, (@row := @row + 1) as rank
from (
SELECT bu.username,
bg.id as goal_id,
br.id as reason_id,
(SELECT COUNT(test_reason_id) FROM test_rank WHERE test_reason_id = br.id) as point
FROM
test_goal AS bg INNER JOIN test_reason AS br ON
br.user_id=bg.user_id INNER JOIN test_user AS bu ON
br.user_id=bu.id
WHERE
bg.id = br.test_goal_id
GROUP BY
bg.id
ORDER BY
point DESC
) a, (SELECT @row := 0) r
See this simplified SQLFiddle example
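If you are on MySQL 8.0 or later, here is a sketch of the same ranking with ROW_NUMBER() instead of a variable. It wraps the query from the question unchanged, so it assumes that query runs on your server as written (its GROUP BY bg.id may be rejected when ONLY_FULL_GROUP_BY is enabled). The goal_id tiebreaker for equal points is my assumption. Note that RANK became a reserved word in MySQL 8.0, so the alias needs backticks.

```sql
-- Sketch only, assuming MySQL 8.0+ (window functions).
SELECT a.*,
       -- ties on point get consecutive numbers, ordered by goal_id
       ROW_NUMBER() OVER (ORDER BY a.point DESC, a.goal_id) AS `rank`
FROM (
    SELECT bu.username,
           bg.id AS goal_id,
           br.id AS reason_id,
           (SELECT COUNT(test_reason_id)
            FROM test_rank
            WHERE test_reason_id = br.id) AS point
    FROM test_goal AS bg
    INNER JOIN test_reason AS br ON br.user_id = bg.user_id
    INNER JOIN test_user AS bu ON br.user_id = bu.id
    WHERE bg.id = br.test_goal_id
    GROUP BY bg.id
) a
ORDER BY `rank`;
```

If tied rows should share a rank instead, RANK() or DENSE_RANK() can be swapped in for ROW_NUMBER() with the same OVER clause.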