How to rank values from a CSV file in NetLogo - csv

I have a CSV file that contains a set of values, and I am trying to rank these values the way Excel does, using competition ranking. In competition ranking, items that compare equal receive the same ranking number, and then a gap is left in the ranking numbers. The number of ranking numbers left out in this gap is one less than the number of items that compared equal. Equivalently, each item's ranking number is 1 plus the number of items ranked above it. This ranking strategy is frequently adopted for competitions, as it means that if two (or more) competitors tie for a position in the ranking, the position of all those ranked below them is unaffected (i.e., a competitor only comes second if exactly one person scores better than them, third if exactly two people score better than them, fourth if exactly three people score better than them, etc.).
Thus if A ranks ahead of B and C (which compare equal), which are both ranked ahead of D, then A gets ranking number 1 ("first"), B gets ranking number 2 ("joint second"), C also gets ranking number 2 ("joint second") and D gets ranking number 4 ("fourth").
Set ranking[0] to 1.
For each index i from 1 onward in the sorted score list:
    If score[i] equals score[i-1], they should have the same ranking:
        ranking[i] = ranking[i-1]
    else the ranking should equal the current index:
        ranking[i] = i + 1
        (+ 1 due to 0-based indices and 1-based ranking)
I tried the following code:
extensions [csv]
globals [data variable]
turtles-own [var]

to setup
  file-close-all
  file-open "test1.csv"
  ;; read the data all at once by using csv:from-file
  set data csv:from-file "test1.csv"
  reset-ticks
end

to ranking
  if file-at-end? [stop]
  ;; extract value from the list, using item 0 to remove the list, and just keep the value
  set variable item 0 item ticks data
  if ticks = length data [stop]
  let rank-list sort-on [variable] turtles
  let ranks n-values length rank-list [ ]
  (foreach rank-list ranks [ask ?1 [set variable 2] ] )
  tick
  ;show variable
end
but I did not succeed with NetLogo 6.1.
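The ranking steps listed in the question can be sketched in Python to show the intended result. This is not NetLogo and not a fix for the code above; it is only a minimal illustration, and it assumes that a higher score ranks first. The function name `competition_ranks` is made up for this example.

```python
def competition_ranks(scores):
    """Return the 1-based competition rank of each score, highest score first."""
    # Indices of the scores, ordered from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, idx in enumerate(order):
        previous = order[position - 1] if position > 0 else None
        if previous is not None and scores[idx] == scores[previous]:
            # Tie with the previous item in sorted order: share its rank.
            ranks[idx] = ranks[previous]
        else:
            # Otherwise the rank is the 1-based position in sorted order.
            ranks[idx] = position + 1
    return ranks


print(competition_ranks([90, 70, 70, 50]))  # [1, 2, 2, 4]
```

The ranks come back in the original order of the input, so they can be written next to the values they belong to.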

Related

I want to know the rank in a Django queryset

I have a product list model and would like to know the ranking of a specific price of this model.
sorted_product_list = Product.objects.all().order_by('-price')
my_product = {'id': 10, 'price': 20000}
Django has a RowNum class, but it is not supported for MySQL.
The only idea I have is to use enumerate:
for rank, element in enumerate(sorted_product_list):
    if element.id == my_product['id']:
        my_product_rank = rank
Is there any other solution?
We can obtain the rank by counting the number of Products with a higher price (so the ones that would have come first), so:
rank = Product.objects.filter(price__gt=my_product['price']).count()
Or in case we do not know the price in advance, we can first fetch the price:
actual_price = Product.objects.values_list('price', flat=True).get(id=my_product['id'])
rank = Product.objects.filter(price__gt=actual_price).count()
So instead of "generating" a table, we can filter the rows above that row, and count them.
Note that in case multiple Products have the same price, we will take as rank the smallest rank among those Products. So if there are four products with prices $200, $100, $100, and $50, then both Products with price $100 will have rank 1 (ranks count from 0, so the $200 Product has rank 0). The Product that costs $50 will have rank 3. In some sense that is logical, since there is no "internal rank" among those products: the database has the freedom to return these products in any way it wants.
Given there is an index on the price column (and it is a binary tree), this should work quite fast. The query only counts rows in the database; it does not fetch the Product objects themselves.
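The counting approach and the tie behaviour described above can be checked without a database. The following is a plain-Python sketch, not Django code: the list of dicts stands in for the Product table, and `rank_by_count` is a hypothetical helper that mirrors `filter(price__gt=...).count()`.

```python
# Stand-in for the Product table, using the four prices from the example.
products = [
    {"id": 1, "price": 200},
    {"id": 2, "price": 100},
    {"id": 3, "price": 100},
    {"id": 4, "price": 50},
]


def rank_by_count(items, price):
    """0-based rank: how many items have a strictly higher price."""
    return sum(1 for item in items if item["price"] > price)


for product in products:
    print(product["id"], rank_by_count(products, product["price"]))
# 1 0
# 2 1
# 3 1
# 4 3
```

Both $100 products share rank 1 and the $50 product gets rank 3, which is the same gap-leaving behaviour as competition ranking, shifted to start at 0.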
In case the internal rank is important, we can use an approach where we first determine the "external rank", and then iterate through Products with the same price to determine the "internal rank". Note that this does not make much sense, since between two queries it is possible that this "internal order" will change:
# rank that also takes into account *equal* prices, but *unstable*
actual_price = Product.objects.values_list('price', flat=True).get(id=my_product['id'])
rank = Product.objects.filter(price__gt=actual_price).count()
for p in Product.objects.filter(price=actual_price):
    if p.id != my_product['id']:
        rank += 1
    else:
        break
We thus keep incrementing while we have not found the product; once we find it, we stop iterating and have obtained the rank.

Need a different permutation of groups of numbers

I have numbers from 1 to 36. What I am trying to do is put all these numbers into three groups and work out all the various permutations of groups.
Each group must contain 12 numbers, from 1 to 36
A number cannot appear in more than one group, per permutation
Here is an example....
Permutation 1
Group 1: 1,2,3,4,5,6,7,8,9,10,11,12
Group 2: 13,14,15,16,17,18,19,20,21,22,23,24
Group 3: 25,26,27,28,29,30,31,32,33,34,35,36
Permutation 2
Group 1: 1,2,3,4,5,6,7,8,9,10,11,13
Group 2: 12,14,15,16,17,18,19,20,21,22,23,24
Group 3: 25,26,27,28,29,30,31,32,33,34,35,36
Permutation 3
Group 1: 1,2,3,4,5,6,7,8,9,10,11,14
Group 2: 12,13,15,16,17,18,19,20,21,22,23,24
Group 3: 25,26,27,28,29,30,31,32,33,34,35,36
Those are three examples; I would expect there to be millions/billions more.
The analysis that follows assumes the order of groups matters - that is, if the numbers were 1, 2, 3 then the grouping [{1},{2},{3}] is distinct from the grouping [{3},{2},{1}] (indeed, there are six distinct groupings when taking from this set of numbers).
In your case, how do we proceed? Well, we must first choose the first group. There are 36 choose 12 ways to do this, or (36!)/[(12!)(24!)] = 1,251,677,700 ways. We must then choose the second group. There are 24 choose 12 ways to do this, or (24!)/[(12!)(12!)] = 2,704,156 ways. Since the second choice is already conditioned upon the first, we may get the total number of ways of taking the three groups by multiplying the numbers; the total number of ways to choose three equal groups of 12 from a pool of 36 is 3,384,731,762,521,200. If you represented numbers using 8-bit bytes then to store every list would take at least ~3 petabytes (well, I guess times the size of the list, which would be 36 bytes, so more like ~108 petabytes). This is a lot of data and will take some time to generate and no small amount of disk space to store, so be aware of this.
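The counts above are easy to verify with Python's `math.comb`. This is just an arithmetic check; the storage estimate assumes one byte per number and 36 numbers per grouping, as in the paragraph above, and the variable names are chosen for this example.

```python
from math import comb, factorial

first_group = comb(36, 12)    # ways to pick group 1 from 36 numbers
second_group = comb(24, 12)   # ways to pick group 2 from the remaining 24
total = first_group * second_group  # group 3 is whatever is left over

print(first_group)   # 1251677700
print(second_group)  # 2704156
print(total)         # 3384731762521200

# If the order of the three groups did not matter, each grouping would be
# counted 3! = 6 times, so the count would shrink accordingly.
print(total // factorial(3))  # 564121960420200

# Storage estimate: 36 one-byte numbers per grouping, in binary petabytes.
pebibytes = total * 36 / 2**50
print(round(pebibytes, 1))
```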
To actually implement this is not so terrible. However, I think you are going to have undue difficulty implementing this in SQL, if it's possible at all. Pure SQL does not have operations that return more than n^2 entries (for a simple cross join) and so getting such huge numbers of results would require a large number of joins. Moreover, it does not strike me as possible to generalize the procedure since pure SQL has no ability to do general recursion and therefore cannot do a variable number of joins.
You could use a procedural language to generate the groupings and then write the groupings into a database. I don't know whether this is what you are after.
n = 36
group1[1...12] = []
group2[1...12] = []
group3[1...12] = []
function Choose(input[1...n], m, minIndex, group)
    // n here is the length of the input passed to this call
    if minIndex + m > n + 1 then
        return
    if m = 0 then
        if group = group1 then
            Choose(input[1...n], 12, 1, group2)
        else if group = group2 then
            group3[1...12] = input[1...12]
            print group1, group2, group3
        return
    for i = minIndex to n do
        group[12 - m + 1] = input[i]
        // "." concatenates: pass the input with element i removed
        Choose(input[1 ... i - 1].input[i + 1 ... n], m - 1, i, group)
When you call this like Choose([1...36], 12, 1, group1) what it does is fill in group1 with all possible ordered subsequences of length 12. At that point, m = 0 and group = group1, so the call Choose([?], 12, 1, group2) is made (for every possible choice of group1, hence the ?). That will choose all remaining ordered subsequences of length 12 for group2, at which point again m = 0 and now group = group2. We may now safely assign group3 to the remaining entries (there is only one way to choose group3 after choosing group1 and group2).
We take ordered subsequences only by propagating the index at which to begin looking on the recursive call (minIndex). We take ordered subsequences to avoid getting permutations of the same set of 12 items (since order doesn't matter within a group).
Each recursive call to Choose in the loop passes input with one element removed: precisely that element that just got added to the group under consideration.
We check for minIndex + m > n + 1 and stop the recursion early because, in this case, we have skipped too many items in the input to be able to ever fill up the current group with 12 items (while choosing the subsequence to be ordered).
You will notice I have hard-coded the assumption of 12/36/3 groups right into the logic of the program. This was done for brevity and clarity, not because you can't parameterize it in the input size N and the number of groups k to form. To do this, you'd need to create an array of groups (k groups of size N/k each), then call Choose with N/k instead of 12 and use a select/switch case statement instead of if/then/else to determine whether to Choose again or print. But those details can be left as an exercise.
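For comparison, here is a short Python sketch of the same idea, parameterized by group size. It is not a translation of the pseudocode above: it swaps the hand-written recursion over indices for `itertools.combinations`, and the name `ordered_groupings` is invented for this example. It assumes the items are distinct and that their count is a multiple of the group size. It is run on 6 numbers in groups of 2, because the full 36/12 case produces far too many results to enumerate casually.

```python
from itertools import combinations


def ordered_groupings(items, group_size):
    """Yield every split of items into labelled groups of group_size."""
    items = tuple(items)
    if not items:
        yield ()  # nothing left to place: one (empty) way to finish
        return
    # combinations() yields each subset once, in sorted order, so we never
    # produce two orderings of the same group.
    for first in combinations(items, group_size):
        chosen = set(first)
        rest = tuple(x for x in items if x not in chosen)
        for tail in ordered_groupings(rest, group_size):
            yield (first,) + tail


small = list(ordered_groupings(range(1, 7), 2))
print(len(small))  # 90, i.e. C(6,2) * C(4,2) * C(2,2)
print(small[0])    # ((1, 2), (3, 4), (5, 6))
```

Because this is a generator, groupings can be consumed or written out one at a time instead of being held in memory all at once.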

"Sparse" Rank in Business Objects XI Web Intelligence?

In Business Objects XI Web Intelligence the Rank function returns dense results. For example when ranking by "Amount" I want to return the top ten records only. However three records tie for 5th place on "Amount". Result is a total of 12 records: one each for places 1 to 4 and 6 to 10 and 3 records for 5th place.
Desired result is a "sparse" top ten that drops the two lowest ranked records (places 9 and 10).
I tried to do this and rank customers by amount.
I have 2 objects: [Amount] and [Customernumber].
[Customernumber] is numeric.
I created a new variable:
[varForSorting]=[Amount]*10000000+ToNumber([Customernumber])
Then I rank by the new variable [varForSorting].
Customers with the same Amount will then be ordered by customer number, so no two rows share a rank. I hope this helps.
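The composite-key trick above is not specific to Web Intelligence. The following Python sketch shows the same arithmetic on made-up data; it is not Business Objects syntax. It assumes integer amounts and customer numbers below 10,000,000, so that the customer number can never spill into the amount part of the key. The function name `sparse_top_n` and the sample records are hypothetical.

```python
def sparse_top_n(records, n=10, scale=10_000_000):
    """records: (customer_number, amount) pairs; return the top n with no ties."""
    def composite(record):
        customer_number, amount = record
        # Same idea as [Amount]*10000000 + ToNumber([Customernumber]).
        return amount * scale + customer_number

    return sorted(records, key=composite, reverse=True)[:n]


records = [(101, 500), (102, 400), (103, 400), (104, 400), (105, 300)]
print(sparse_top_n(records, n=3))
# [(101, 500), (104, 400), (103, 400)]
```

Note that when ranking in descending order, the tie among equal amounts goes to the higher customer number; subtracting the customer number instead of adding it would reverse that.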
Here is an example of how I solved it for a change in Account Count over time. This approach allows you to break your dense rank ties using other measures in your data provider. Basically you use multiple measures in one rank and decide which measure to rank by first, second, etc:
Step 1: Determine the change amount
v_Account_Count_Delta_Amount
=([v_Account_Count_After] - [v_Account_Count_Before])
Step 2: Rank the change amounts (this is where ties and dense rank cause multiple rows to be returned)
v_Account_Count_Delta_Amount_Rank
=NoFilter(Rank([v_Account_Count_Delta_Amount]))
Step 3: Compute the tie breaking rank using other measures
v_MonthToDateMeasuresRank
=NoFilter(Rank([Month To Date Sva]+ [Bank Share Balance] + [Total Commitment]))
Step 4: Compute a combined rank that is now free from ties and weight your ranks however you choose
v_Account_Count_Combined_Rank
=Rank([v_Account_Count_Delta_Amount_Rank]* 1000000 + [v_MonthToDateMeasuresRank];Bottom)
Step 5: Filter your data block for v_Account_Count_Combined_Rank <= 10
Ultimately depending on your data it could still result in a tie unless you take the additional step of ranking by some other unique attribute that you can turn to a number (see Maria Ruchko's answer for that bit of magic using Customer Number). I tried to do that with RowIndex() and LineNumber() but could not get usable results. My measures when added together happen to never tie so this works for my specific data blob.

Calculating the cost of Block Nested Loop Joins

I am trying to calculate the cost of the (most efficient) block nested loop join in terms of NDPR (number of disk page reads). Suppose you have a query of the form:
SELECT COUNT(*)
FROM county JOIN mcd
ON county.state_code = mcd.state_code
AND county.fips_code = mcd.fips_code
WHERE county.state_code = #NO
where #NO is substituted for a state code on each execution of the query.
I know that I can derive the NDPR using: NDPR(R x S) = |Pages(R)| + Pages(R) / B - 2 . |Pages(S)|
(where the smaller table is used as the outer relation in order to produce fewer page reads. Ergo:
R = county, S = mcd).
I also know that Page size = 2048 bytes
Pointer = 8 byte
Num. rows in mcd table = 35298
Num. rows in county table = 3141
Free memory buffer pages B = 100
Pages(X) = (rowsize)(numrows) / pagesize
What I am trying to figure out is how the "WHERE county.state_code = #NO" affects my cost?
Thanks for your time.
First a couple of observations regarding the formula you wrote:
I'm not sure why you write "B - 2" instead of "B - 1". From a theoretical perspective, you need a single buffer page to read in relation S (you can do it by reading one page at a time).
Make sure you use all the brackets. I would write the formula as:
NDPR(R x S) = |Pages(R)| + |Pages(R)| / (B-2) * |Pages(S)|
All the numbers in the formula would need to be rounded up (but this is nitpicking).
The explanation for the generic BNLJ formula:
You read in as many tuples from the smaller relation (R) as you can keep in memory (B-1 or B-2 pages worth of tuples).
For each group of B-2 pages worth of tuples, you then have to read the whole S relation ( |Pages(S)|) to perform the join for that particular range of relation R.
At the end of the join, relation R is read exactly one time and relation S is read as many times as we filled the memory buffer, namely |Pages(R)| / (B-2) times.
Now the answer:
In your example a selection criterion is applied to relation R (table county in this case). This is the WHERE county.state_code = #NO part of the query. Therefore, the generic formula does not apply directly.
When reading from relation R (i.e., table county in your example), we can discard all the tuples that do not match the selection criterion. Assuming that there are 50 states in the USA and that all states have the same number of counties, only 2% of the tuples in table county qualify on average and need to be stored in memory. This reduces the number of iterations of the inner loop of the join (i.e., the number of times we need to scan relation S / table mcd). The 2% number is obviously just the expected average and will change depending on the actual given state.
The formula for your problem therefore becomes:
NDPR(R x S) = |Pages(County)| + |Pages(County)| / (B - 2) * |Counties in state #NO| / |Rows in table County| * |Pages(Mcd)|
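To see how the selection changes the numbers, the formula can be evaluated in a few lines of Python. This is only a sketch: the question does not give the row sizes, so the 100-byte row size below is a purely hypothetical placeholder, and the 1/50 selectivity is the "equal number of counties per state" assumption from the answer. The helper names `pages` and `bnlj_cost` are made up for this example.

```python
from math import ceil

PAGE_SIZE = 2048     # bytes, from the question
BUFFER_PAGES = 100   # free memory buffer pages B, from the question


def pages(row_size, num_rows, page_size=PAGE_SIZE):
    """Pages(X) = rowsize * numrows / pagesize, rounded up."""
    return ceil(row_size * num_rows / page_size)


def bnlj_cost(pages_r, pages_s, buffer_pages=BUFFER_PAGES, selectivity=1.0):
    """Read R once, then scan S once per (B - 2)-page chunk of qualifying R tuples."""
    scans_of_s = ceil(pages_r * selectivity / (buffer_pages - 2))
    return pages_r + scans_of_s * pages_s


ROW_SIZE = 100  # hypothetical: the real row sizes are not stated in the question
county_pages = pages(ROW_SIZE, 3141)
mcd_pages = pages(ROW_SIZE, 35298)

print(county_pages, mcd_pages)                                 # 154 1724
print(bnlj_cost(county_pages, mcd_pages))                      # 3602
print(bnlj_cost(county_pages, mcd_pages, selectivity=1 / 50))  # 1878
```

With these placeholder sizes the qualifying county tuples fit in a single chunk of the buffer, so mcd is scanned once instead of twice.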

Grade Up/Down APL order

How come that
⌽(⍒'Hello')
is
1 2 4 3 5
when
⍋'Hello'
is
1 2 3 4 5
?
I'm new to APL and stumbled on it by accident. I just wondered why the second l comes before the first.
You are using both the grade up ⍋ and grade down ⍒ as monadic primitives.
By definition grade up returns an integer array of indices which specify the sorted order of the expression following it, in ascending order. If any elements are equal (in your example the two letter l's) , they will appear in the result in the same order that they appeared in the input expression.
So, ⍋'Hello' returns 1 2 3 4 5. The two l's are in the same order, i.e., the 3rd character (1st letter l) precedes the 4th character (2nd letter l).
By definition grade down also returns an integer array of indices which specify the sorted order of the expression following it, in descending order. If any elements are equal (in your example the two letter l's) , they will also appear in the result in the same order that they appeared in the expression.
So, ⍒'Hello' returns 5 3 4 2 1. The two l's remain in the same order because they are equal.
When you apply reverse ⌽ (monadic ⌽ reverses its argument), the integer array gets reversed to 1 2 4 3 5, as you witnessed.
The outcome you are seeing is precisely what is expected given the way the functions are defined and how they deal with equal values.
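The same behaviour can be reproduced outside APL with any stable sort. The Python sketch below is an emulation, not APL: it assumes index origin 1, and it uses Python's character comparison as a stand-in for APL's ordering of characters, which gives the same order for the letters in 'Hello'. The names `grade_up` and `grade_down` are chosen to mirror the primitives.

```python
def grade_up(values):
    """1-based indices that sort values ascending; ties keep input order."""
    return [i + 1 for i in sorted(range(len(values)), key=lambda i: values[i])]


def grade_down(values):
    """1-based indices that sort values descending; ties keep input order."""
    # reverse=True in Python's sort still preserves the order of equal items.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    return [i + 1 for i in order]


word = "Hello"
print(grade_up(word))          # [1, 2, 3, 4, 5]
print(grade_down(word))        # [5, 3, 4, 2, 1]
print(grade_down(word)[::-1])  # [1, 2, 4, 3, 5]
```

Reversing the grade-down result reverses the tie order as well, which is why the two l's swap places compared with grade up.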
If you want to see a more extreme example, compare the output for an array whose elements are all equal. Create an array of 10 elements, each with the value 1, using 10⍴1, and then try the grade up function and the grade down function on it:
⍋10⍴1
and
⍒10⍴1
They will both yield the same result:
1 2 3 4 5 6 7 8 9 10
The grade up ⍋ and grade down ⍒ primitives preserve the order of equal elements. As others have said, there must be a rule for equal arguments. But this rule has the virtue that it allows multi-key sorts.
That is, if you have an array with several associated keys, by sorting on each key from least significant to most significant, you obtain a result sorted by the most significant key, with equals sorted by the 2nd most significant, items equal on the 1st two sorted by the 3rd, and so on. For this to work the index vector must be captured and used to update all keys and the data to keep them in sync. Or they could be stored in a nested structure, in which case they would automatically be kept in proper relative order.
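A small Python sketch of that multi-key procedure follows. It relies only on the stability of Python's built-in sort; the helper names `stable_order` and `reorder` and the sample arrays are made up for illustration, and indices are 0-based here rather than APL's default origin.

```python
def stable_order(keys):
    """0-based index vector that sorts keys ascending; ties keep input order."""
    return sorted(range(len(keys)), key=lambda i: keys[i])


def reorder(order, *arrays):
    """Apply one index vector to every array so they stay in sync."""
    return [[array[i] for i in order] for array in arrays]


major = ["b", "a", "b", "a"]   # most significant key
minor = [2, 2, 1, 1]           # least significant key
data = ["w", "x", "y", "z"]

# Sort on the least significant key first, then on the most significant one.
major, minor, data = reorder(stable_order(minor), major, minor, data)
major, minor, data = reorder(stable_order(major), major, minor, data)

print(major)  # ['a', 'a', 'b', 'b']
print(minor)  # [1, 2, 1, 2]
print(data)   # ['z', 'x', 'y', 'w']
```

The second sort leaves rows with equal `major` values in the order produced by the first sort, which is exactly what makes the result sorted by `minor` within each `major` group.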