In Business Objects XI Web Intelligence, the Rank function returns dense results. For example, when ranking by "Amount" I want to return only the top ten records. However, three records tie for 5th place on "Amount", so the result is a total of 12 records: one each for places 1 to 4 and 6 to 10, and three records for 5th place.
The desired result is a "sparse" top ten that drops the two lowest-ranked records (places 9 and 10).
I tried to do this and rank customers by amount.
I have 2 objects: [Amount] and [Customernumber].
[Customernumber] is numeric.
I created a new variable:
[varForSorting]=[Amount]*10000000+ToNumber([Customernumber])
Then I rank by the new variable [varForSorting].
Customers with the same Amount are then ordered by customer number, so no two rows share a rank. This relies on every customer number being smaller than the 10000000 multiplier. I hope this helps.
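The composite-key idea above can be sketched outside Web Intelligence. The Python below is only an illustration: the sample rows, the `SCALE` constant and the function names are made up, and it assumes whole-number amounts and customer numbers below `SCALE` (with fractional amounts, two amounts that differ by less than a customer number divided by `SCALE` could be ordered wrongly).

```python
# Hypothetical sample data: (customer_number, amount). Three customers tie on amount.
customers = [
    (1001, 500.0),
    (1002, 300.0),
    (1003, 300.0),
    (1004, 300.0),
    (1005, 100.0),
]

SCALE = 10_000_000  # assumption: every customer number is below this value


def sort_key(customer_number, amount):
    """Mirror of [varForSorting]: amount dominates, customer number breaks ties."""
    return amount * SCALE + customer_number


def top_n(rows, n):
    """Return the n rows with the largest composite key, highest first."""
    return sorted(rows, key=lambda r: sort_key(*r), reverse=True)[:n]


top3 = top_n(customers, 3)
print(top3)
```

Because the key is unique per row, asking for the top 3 returns exactly 3 rows even though three customers share the same amount. Note that when ranking from the top, the tied customer with the highest customer number comes first.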
Here is an example of how I solved it for a change in Account Count over time. This approach allows you to break your dense rank ties using other measures in your data provider. Basically you use multiple measures in one rank and decide which measure to rank by first, second, etc:
Step 1: Determine the change amount
v_Account_Count_Delta_Amount
=([v_Account_Count_After] - [v_Account_Count_Before])
Step 2: Rank the change amounts (this is where ties and dense rank cause multiple rows to be returned)
v_Account_Count_Delta_Amount_Rank
=NoFilter(Rank([v_Account_Count_Delta_Amount]))
Step 3: Compute the tie breaking rank using other measures
v_MonthToDateMeasuresRank
=NoFilter(Rank([Month To Date Sva]+ [Bank Share Balance] + [Total Commitment]))
Step 4: Compute a combined rank that is now free from ties and weight your ranks however you choose
v_Account_Count_Combined_Rank
=Rank([v_Account_Count_Delta_Amount_Rank]* 1000000 + [v_MonthToDateMeasuresRank];Bottom)
Step 5: Filter your data block for v_Account_Count_Combined_Rank <= 10
Ultimately depending on your data it could still result in a tie unless you take the additional step of ranking by some other unique attribute that you can turn to a number (see Maria Ruchko's answer for that bit of magic using Customer Number). I tried to do that with RowIndex() and LineNumber() but could not get usable results. My measures when added together happen to never tie so this works for my specific data blob.
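As a rough illustration of steps 2 to 5, here is a small Python sketch. The rows, the `dense_rank` helper and the weight of 1,000,000 are assumptions made for the example, not Web Intelligence functions; the weight only has to be larger than the highest possible tie-breaking rank.

```python
def dense_rank(values, descending=True):
    """Dense rank: equal values share a rank and no rank numbers are skipped."""
    distinct = sorted(set(values), reverse=descending)
    position = {v: i + 1 for i, v in enumerate(distinct)}
    return [position[v] for v in values]


# Hypothetical rows: (account, delta_amount, tie_break_measure)
rows = [
    ("A", 12, 900.0),
    ("B", 12, 450.0),
    ("C", 12, 700.0),
    ("D", 20, 100.0),
    ("E", 5, 300.0),
]

delta_rank = dense_rank([r[1] for r in rows])    # step 2: ties on delta amount
measure_rank = dense_rank([r[2] for r in rows])  # step 3: tie-breaking rank

# Step 4: weight the primary rank so it always dominates the tie-breaker,
# then rank the combined key with the smallest key first (like Rank(...;Bottom)).
combined_key = [d * 1_000_000 + m for d, m in zip(delta_rank, measure_rank)]
combined_rank = dense_rank(combined_key, descending=False)

# Step 5: keep only the rows whose combined rank is within the cut-off.
ranked = sorted(zip(rows, combined_rank), key=lambda pair: pair[1])
top3 = [row[0] for row, rank in ranked if rank <= 3]
print(top3)
```

Accounts A, B and C tie on the delta amount, but the second measure separates them, so the cut-off returns exactly three rows.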
There are 5 tables: mlb_batting, mlb_manager, mlb_master, mlb_pitching, mlb_team.
Find the top 10 (highest) “strike outs per walk” statistic for all pitchers with at least 1 walk that played in at least 25 games. You should display their first name, last name, and K/BB statistic. K/BB is computed by dividing the number of strike outs by the number of walks (“base on balls”). You will need to use “limit” in MySQL (not talked about in class or notes – you will have to search how to do it). I would like this query done 2 different ways. One that only looks at the 25 games and 1 walk on a per stint basis. That is, if they played for two different teams (two different stints) then you would count those separately. And the other query should combine all the stints they had. That is, if they played for two different teams you would add up their games and walks.
My solution is:
SELECT NAME_FIRST, NAME_LAST, SUM(strikeouts) / SUM(walks) AS KS_PER_BB
FROM mlb_master
JOIN mlb_pitching
ON mlb_master.player_id = mlb_pitching.player_id
WHERE walks >= 1 AND games >= 25
GROUP BY name_first, name_last, mlb_pitching.stint
ORDER BY KS_PER_BB DESC
LIMIT 10;
I am wondering if this solution is better for the first way my professor wants it done or the second way, if any.
This solution is appropriate for the first query because by having GROUP BY stint, each stint is considered different for each player.
For the second way, could I remove the stint column from the GROUP BY clause so that it groups the records for a particular player together, regardless of the different stints they played for?
Would this result in the sum of all their walks and strikeouts from all their stints being used to calculate the KS_PER_BB statistic, giving you the combined total for each player?
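One thing to watch when dropping stint from the GROUP BY: the WHERE clause still filters individual rows, so a pitcher whose stints each have fewer than 25 games is excluded even when the combined total reaches 25. Moving the thresholds into HAVING over the sums is one way to handle that. The sketch below tries both variants on an in-memory SQLite database from Python. The table layout (one row per stint) and the sample players are assumptions, and the `1.0 *` is only there because SQLite would otherwise do integer division, which MySQL does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mlb_master (player_id TEXT PRIMARY KEY, name_first TEXT, name_last TEXT);
CREATE TABLE mlb_pitching (player_id TEXT, stint INTEGER, games INTEGER,
                           walks INTEGER, strikeouts INTEGER);
INSERT INTO mlb_master VALUES ('p1', 'Al', 'Ace'), ('p2', 'Bo', 'Split');
INSERT INTO mlb_pitching VALUES
    ('p1', 1, 30, 10, 50),   -- qualifies on its own
    ('p2', 1, 15, 4, 20),    -- too few games as a single stint
    ('p2', 2, 15, 6, 10);    -- too few games as a single stint
""")

# Way 1: every stint is judged on its own, so the thresholds go in WHERE.
PER_STINT = """
SELECT m.name_first, m.name_last, 1.0 * p.strikeouts / p.walks AS ks_per_bb
FROM mlb_master m JOIN mlb_pitching p ON m.player_id = p.player_id
WHERE p.walks >= 1 AND p.games >= 25
ORDER BY ks_per_bb DESC LIMIT 10
"""

# Way 2: stints are added up per player, so the thresholds go in HAVING.
COMBINED = """
SELECT m.name_first, m.name_last,
       1.0 * SUM(p.strikeouts) / SUM(p.walks) AS ks_per_bb
FROM mlb_master m JOIN mlb_pitching p ON m.player_id = p.player_id
GROUP BY m.player_id, m.name_first, m.name_last
HAVING SUM(p.walks) >= 1 AND SUM(p.games) >= 25
ORDER BY ks_per_bb DESC LIMIT 10
"""

per_stint = conn.execute(PER_STINT).fetchall()
combined = conn.execute(COMBINED).fetchall()
print(per_stint)
print(combined)
```

In this made-up data the second pitcher appears only in the combined result, because neither stint reaches 25 games on its own. Grouping by player_id as well as the name also keeps two different players with the same name apart.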
I'm trying to build a reporting table to track server traffic and popularity overall. Each SID is a unique game server hosting a particular game, and each UCID is a unique player key connecting to that server.
Say I have a table like so:
SID UCID AvgTime NumConnects
-----------------------------------------
1 AIE9348ietjg 300.55 5
1 Po328gieijge 500.66 7
2 AIE9348ietjg 234.55 3
3 Po328gieijge 1049.88 18
We can see that there are 2 unique players and 3 unique servers, with SID 1 having 2 players that have connected to it at some point in the past. AvgTime is the average amount of time the player spent on that server (in seconds), and NumConnects is the number of connections the average was computed from (i.e. 300.55 is the average of 5 connections).
Now I run a job in the background where I process a raw connection table and pull out player connections like so:
SID UCID ConnectTime DisconnectTime
-----------------------------------------
1 AIE9348ietjg 90.35 458.32
2 Po328gieijge 30.12 87.15
2 AIE9348ietjg 173.12 345.35
This table has no ID or other fluff to help condense my example. There may be multiple connect/disconnect records for multiple players in this table. What I want to do is add to my existing AvgTime for each SID these new values.
I am trying to use a formula taken from this Math Stack Exchange answer: https://math.stackexchange.com/questions/1153794/adding-to-an-average-without-unknown-total-sum/1153800#1153800
Average = (Average * Size + NewValue) / (Size + 1)
How can I write an update query that updates the server traffic table above for each SID, adding to the average with the above formula for each matching pair of records? I tried something like the following, but it didn't work (it returned null):
UPDATE server_traffic st
LEFT JOIN connect_log l
ON st.SID = l.SID AND st.UCID = l.UCID
SET AvgTime = (AvgTime * NumConnects + SUM(l.DisconnectTime - l.ConnectTime) / NumConnects + COUNT(l.UCID)
I would prefer an answer in MySql, but I'll accept MS SQL as well.
EDIT
I understand that statistics and calculations are generally not to be stored in tables and that you can run reports that would crunch the numbers for you. My requirement is that users can go to a website and view the popularity of various servers. This needs to be done in a way that
A: running a complex query per user doesn't crash or slow down the system
B: the page returns the data within a few seconds at most
See this example here: https://bf4stats.com/pc/shinku555555
This is a web page for battlefield 4 stats - notice that the load is almost near instant for this player, and I get back a load of statistics without waiting for some complex report query to return the data. I'm assuming they store these calculations in preprocessed tables where the webpage just needs to do a simple select to return back the values. That's the same approach I want to take with my Database and Web Application design.
Sorry if this is off topic to the original question - but hopefully this adds additional context that helps people understand my needs.
Aggregate functions like SUM and COUNT cannot be used on their own at the row level in an UPDATE; they have to be contained in an aggregate query. So consider joining to an aggregate subquery in the UPDATE...LEFT JOIN. Also, adjust the parentheses in SET to match the formula above, so that the whole denominator is grouped.
Also, note that because you use LEFT JOIN, rows with no matching IDs get NULL for the aggregate fields, and any arithmetic on NULL returns NULL. You can convert the NULLs to zero with IFNULL(), but the formula's division can then still fail when the denominator ends up as zero.
UPDATE server_traffic s
LEFT JOIN
  (SELECT SID, UCID, COUNT(UCID) AS GrpCount,
          SUM(DisconnectTime - ConnectTime) AS SumTimeDiff
   FROM connect_log
   GROUP BY SID, UCID) l
ON s.SID = l.SID AND s.UCID = l.UCID
-- new average = (old total + new total) / (old count + new count)
-- NumConnects is not changed here; it also needs to grow by l.GrpCount
SET s.AvgTime = (s.AvgTime * s.NumConnects + l.SumTimeDiff) / (s.NumConnects + l.GrpCount);
Aside - reconsider saving calculations/statistics within tables as they can always be run by queries even by timestamps. Ideally, database tables should store raw values.
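To make the arithmetic behind the UPDATE concrete, here is a minimal Python sketch of folding a batch of new connections into a stored average. It uses the sample rows from the question; the `merge_average` helper and the dictionary layout are assumptions for illustration only. Unlike the UPDATE, the sketch also creates an entry for a player/server pair that is not yet in the traffic table.

```python
from collections import defaultdict


def merge_average(avg, count, new_sum, new_count):
    """Fold new_count observations totalling new_sum into a stored average."""
    total = count + new_count
    if total == 0:
        return avg, count  # nothing stored and nothing new: avoid dividing by zero
    return (avg * count + new_sum) / total, total


# (SID, UCID) -> (AvgTime, NumConnects), copied from the question's first table
traffic = {
    (1, "AIE9348ietjg"): (300.55, 5),
    (1, "Po328gieijge"): (500.66, 7),
    (2, "AIE9348ietjg"): (234.55, 3),
    (3, "Po328gieijge"): (1049.88, 18),
}

# (SID, UCID, ConnectTime, DisconnectTime), copied from the second table
connect_log = [
    (1, "AIE9348ietjg", 90.35, 458.32),
    (2, "Po328gieijge", 30.12, 87.15),
    (2, "AIE9348ietjg", 173.12, 345.35),
]

# Aggregate the batch per pair first, like the GROUP BY subquery does.
batch = defaultdict(lambda: [0.0, 0])
for sid, ucid, connect, disconnect in connect_log:
    batch[(sid, ucid)][0] += disconnect - connect
    batch[(sid, ucid)][1] += 1

# Then merge each aggregated pair into the stored average and count.
for key, (new_sum, new_count) in batch.items():
    avg, count = traffic.get(key, (0.0, 0))
    traffic[key] = merge_average(avg, count, new_sum, new_count)

print(traffic)
```

The count is updated together with the average, which matters because the next batch has to be weighted against the new count.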
I use MySQL to select some data for a player. The result is a list, while I just want one random row.
The following SQL gives a syntax error after limit 1:
select * from tb_rank where score<=150 and score>= 50 and power>=80 and power<=120
limit 1,(select round(rand()*(select count(*) as num from tb_rank where score<=150 and score>= 50 and power>=80 and power<=120)))
50.000 people are a lot if you have them in front of you. But you are talking about crunching numbers on a computer. Here 50.000 is nothing.
Sorting would just take additional time and it is not necessary as you want a random player that has your score ±50 and your power ±20%. A random player of a sorted list is still a random player. It wouldn't make any difference.
Iterate over your playerlist, build a new list of players that have a valid score and power. Then pick a random element of that new list.
On my average laptop this takes less than 5 microseconds.
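A minimal sketch of that filter-then-pick approach in Python, assuming the players are already loaded into memory as dictionaries with `score` and `power` keys (the function name and the data layout are made up for the example):

```python
import random


def pick_opponent(players, my_score, my_power, rng=random):
    """Return a random player within score +/-50 and power +/-20%, or None."""
    candidates = [
        p for p in players
        if abs(p["score"] - my_score) <= 50
        and 0.8 * my_power <= p["power"] <= 1.2 * my_power
    ]
    return rng.choice(candidates) if candidates else None


# Hypothetical player list; only the first two are inside both ranges.
players = [
    {"name": "a", "score": 100, "power": 100},
    {"name": "b", "score": 140, "power": 90},
    {"name": "c", "score": 300, "power": 100},  # score too far away
    {"name": "d", "score": 100, "power": 200},  # power too far away
]

opponent = pick_opponent(players, my_score=100, my_power=100)
print(opponent)
```

With a score of 100 and a power of 100 this reproduces the ranges in the question (score 50 to 150, power 80 to 120). No sorting is involved; the list is scanned once and one element is drawn from the matches.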
I have a regular table in SSRS with 3 groups:
(Parent) STORE - CLERK - PRODUCT (Child)
I have some regular aggregations: how many PRODUCTS were sold by a CLERK, how many CLERKS there are per STORE, and eventually how many PRODUCTS per STORE.
On top of the regular sums and averages, I need to find the percentage for a PRODUCT (type), meaning a particular value of that group.
Example: STORE 001 has sold 10 RADIOS (a PRODUCT), and 100 RADIOS have been sold by all stores.
So basically what I need is to show that STORE 001 is responsible for 10% of all RADIO sales.
(A note: ideally, I would like this to adjust to the data, so if I add new products it will group those as products (naturally) but still give me those percentages.)
In its most basic form you would want to use something like this:
= fields!product.value / sum(fields!product.value)
The first term gives you the total of the current row of data, and the second gives you the total of all rows of that product.
Thus you would have 10 / 100 (per your example).
This assumes that your data is structured correctly. Depending on the structure of your report, you may need to add a scope to the total summation to make sure that you are not totaling any other datasets that may reference the same product or field:
sum(fields!product.value, "--your dataset here--")
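The calculation itself, independent of SSRS, is each store's quantity for a product divided by the total for that product across all stores. The Python sketch below uses made-up sales rows shaped like the example (STORE 001 sells 10 of 100 RADIOS); the row layout and function name are assumptions.

```python
from collections import defaultdict

# Hypothetical detail rows: (store, product, quantity sold)
sales = [
    ("001", "RADIO", 10),
    ("002", "RADIO", 60),
    ("003", "RADIO", 30),
    ("001", "TV", 5),
    ("002", "TV", 15),
]


def share_of_product(rows):
    """Return {(store, product): fraction of that product's total sales}."""
    product_totals = defaultdict(int)  # total per product across all stores
    store_totals = defaultdict(int)    # total per (store, product) pair
    for store, product, qty in rows:
        product_totals[product] += qty
        store_totals[(store, product)] += qty
    return {
        key: qty / product_totals[key[1]]
        for key, qty in store_totals.items()
    }


shares = share_of_product(sales)
print(f"{shares[('001', 'RADIO')]:.0%}")
```

Because the totals are keyed by product, a product that shows up later in the data gets its own denominator without any change to the logic.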
I'm currently trying to build a leaderboard based on 4 metrics. I kept it simple by doing a SELECT TOP (3) from my table, ordered from largest to smallest, and then added a ROW_NUMBER() to give the 1, 2, 3. Then, aggregating it up, I created a CASE WHEN to assign 100 to No. 1, 50 to No. 2 and 25 to No. 3.
The issue is that I get a tie for some of the scores. It then uses the next field to sort and drops one of the tied rows down to 50 points, when I want the joint leaders to both have 100. I hope I've explained this OK.
I'm sure there's a smarter way of doing this?
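One possible direction is to number the rows with a ranking that gives tied scores the same position (what RANK() does in SQL Server) instead of ROW_NUMBER(), and to map the position to points afterwards. The Python sketch below illustrates the idea with made-up names and scores; `competition_rank` and `award_points` are hypothetical helpers, not part of any library.

```python
POINTS = {1: 100, 2: 50, 3: 25}  # points per position; other positions get 0


def competition_rank(scores):
    """Rank 1 = highest score; tied scores share a rank and the next rank is skipped."""
    ordered = sorted(scores, reverse=True)
    # index() finds the first position of a score, so ties get the same rank
    return [ordered.index(s) + 1 for s in scores]


def award_points(scores_by_player):
    """Map each player to the points for their (possibly shared) position."""
    names = list(scores_by_player)
    ranks = competition_rank([scores_by_player[n] for n in names])
    return {name: POINTS.get(rank, 0) for name, rank in zip(names, ranks)}


# Hypothetical scores for one metric: Ann and Bob tie for first place.
awarded = award_points({"Ann": 90, "Bob": 90, "Cy": 70, "Di": 60})
print(awarded)
```

Here both joint leaders get 100 and the next player is in position 3, so they receive 25. If the next player should receive 50 instead, a dense ranking (DENSE_RANK() in SQL Server) is the variant to use.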