I am trying to write a Cuda kernel to generate row-wise histogram based on the input feature set (2 x 6) where each feature row (each having 6 features) is to generate a histogram having nbins=10.
I have implemented the below code but it doesn’t seem to generate the correct row-wise histogram.
import numba
import numpy as np
from numba import cuda
np.random.seed(0)
feature = np.random.randint(1, high=6, size=(2,6), dtype=int)
output = np.zeros(20).astype(np.float32).reshape(2,10)
### Kernal Configuration
threads_per_block = 6
blocks = 2
# moving data to device
d_feature = cuda.to_device(feature)
d_output = cuda.to_device(output)
feature_size = d_feature.shape[1]
#cuda.jit
def row_wise_histogram(feature, output, n):
xmin = np.float32(-4.0)
xmax = np.float32(4.0)
idx = cuda.grid(1)
nbins = 10
bin_width = (xmax - xmin) / nbins
for i in range(n):
# Each thread will take all the row features to generate historgram
input = feature[idx][i]
bin_number = np.int32(nbins * (np.float32(input) - np.float32(xmin)) / (np.float32(xmax) - np.float32(xmin)))
if bin_number >= 0 and bin_number < output.shape[1]:
cuda.atomic.add(output[idx], bin_number, 1)
row_wise_histogram[blocks, threads_per_block](d_feature, d_output, feature_size)
print(d_output.copy_to_host())
And the out results in
[[ 0. 0. 0. 0. 0. 0. 81111. 81111. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 162222. 0. 81111. 0.]]
which is wrong, Will appreciate it if I can get help with the issue inside the row_wise_historgram function!
I think the main issue you have in your code is that your kernel has a thread strategy to have each thread process a row, and you have 2 rows in your feature dataset, but you are launching 12 threads total:
### Kernal Configuration
threads_per_block = 6
blocks = 2
10 of those threads will be indexing out-of-bounds. For 2 rows you only need 2 threads. We can fix this multiple ways, but I will add a "thread-check" to your kernel code, to prevent out-of-bounds threads from doing anything.
You are also histogramming values that don't fit in your output array. Let's suppose your feature has a input value of 4 at some location. Let's put that value through your arithmetic:
bin_number = np.int32(nbins * (np.float32(4) - np.float32(-4)) / (np.float32(4) - np.float32(-4)))
That is 10 * (4-(-4))/(4-(-4))
So that is a bin index of 10. But you only have 10 bins, so valid bin index can only go up to 9. Which means some of your input values (e.g. 4, 5) will not be recorded in your output.
The following code is your code to add the threadcheck, plus the range of input adjusted. And I am printing out the input, the bin each input value was assigned, and the output bins. It seems to be working correctly.
$ cat t65.py
import numba
import numpy as np
from numba import cuda
np.random.seed(0)
feature = np.random.randint(1, high=4, size=(2,6), dtype=int)
output = np.zeros(20).astype(np.float32).reshape(2,10)
mybin = np.empty_like(feature)
### Kernal Configuration
threads_per_block = 6
blocks = 2
# moving data to device
d_feature = cuda.to_device(feature)
d_output = cuda.to_device(output)
feature_size = d_feature.shape[1]
d_mybin = cuda.to_device(mybin)
#cuda.jit
def row_wise_histogram(feature, output, mybin, n):
xmin = np.float32(-4.0)
xmax = np.float32(4.0)
idx = cuda.grid(1)
nbins = 10
bin_width = (xmax - xmin) / nbins
if idx < output.shape[0]:
for i in range(n):
# Each thread will take all the row features to generate historgram
input = feature[idx][i]
bin_number = np.int32(nbins * (np.float32(input) - np.float32(xmin)) / (np.float32(xmax) - np.float32(xmin)))
mybin[idx][i] = bin_number
if bin_number >= 0 and bin_number < output.shape[1]:
cuda.atomic.add(output[idx], bin_number, 1)
row_wise_histogram[blocks, threads_per_block](d_feature, d_output, d_mybin, feature_size)
print(feature)
print(d_mybin.copy_to_host())
print(d_output.copy_to_host())
$ python t65.py
[[1 2 1 2 2 3]
[1 3 1 1 1 3]]
[[6 7 6 7 7 8]
[6 8 6 6 6 8]]
[[ 0. 0. 0. 0. 0. 0. 2. 3. 1. 0.]
[ 0. 0. 0. 0. 0. 0. 4. 0. 2. 0.]]
$ cuda-memcheck python t65.py
========= CUDA-MEMCHECK
[[1 2 1 2 2 3]
[1 3 1 1 1 3]]
[[6 7 6 7 7 8]
[6 8 6 6 6 8]]
[[ 0. 0. 0. 0. 0. 0. 2. 3. 1. 0.]
[ 0. 0. 0. 0. 0. 0. 4. 0. 2. 0.]]
========= ERROR SUMMARY: 0 errors
$
Note that when I restrict the input values to 1..3, then the maximum bin index is 8 (do the math). If I increase the input range to include 4, the maximum bin index goes to 10, which "won't fit". You're correctly handling this case, but it may confuse you, as these values of 4 or 5 won't be recorded in the output. Histogram bin arithmetic is fun. You will need to work out exactly what you want.
Also note that if you run this code, you should see output almost exactly the same as above. If you don't, there is a good chance your numba or cuda install is broken somehow, and the additional run I show with cuda-memcheck will help to discover what may be the issue.
Note that since you are using atomics anyway, there isn't any particular need to assign one thread to each row, you could instead assign one thread to each input point. But that isn't your question; it's a story for another day. Conversely, if you do proceed with one thread per row, each thread doing effectively a private histogram, there is no particular need to use atomics.
I am trying to write an efficient subroutine in Fortran to find the list of all duplicate vertices in an array. The tolerance (fixed using parameter paraEps) corresponds to the radius around each point in which I look for the duplicate points. The vertices and defined for Finite Element Analysis and the range of each component will depend on the 3D geometry on which the methods will be applied (I can expect that values of the components could be varied in [-10^6,10^6] with a precision of 10^-6.
Currently the following code is running on around 150k vertices but it will be used on array of 1-3M vertices.
Example:
A array of vertices. Here I just use some small list of nodes. In my case, nodes are in a 3D space with no specific distribution.
verticesArray =
[ 0 0 0,
0 1 0
1 0 0
1 0 0
1 1 1
0 1 0
1 2 3
1 2 3
0 1.0001 0
0 1.1 0
1 2 4]
The result list (0 means the vertice has no duplicate) with a tolerance of 10^-3.
list = [ 0 1 2 2 0 1 3 3 1 0 0]
I have written a first version that works fine but remains slow especially the number of vertices is large.
Notice that startItem is an integer I can specify for starting the search not from the beginning of the array.
MWE:
module test
implicit none
contains
subroutine get_dup_nodes(nbNodes,dimP,arrayIn,nbDupNodes)
!
integer,intent(in) :: nbNodes,dimP ! number of nodes and dimension
double precision,dimension(:,:),intent(in) :: arrayIn ! list of coordinates
!
integer,intent(out) :: nbDupNodes
!
double precision,dimension(dimP) :: coorNode
integer,dimension(:),allocatable :: listDup !list of duplicates
logical :: currUnique
double precision,allocatable :: distVal
integer :: itN,itM,itF,minCurrIt,currIt,nbTmp,sizeR,startItemVal
!
double precision,parameter :: paraEps=1e-3
!
!initialize variables
allocate(listDup(nbNodes))
listDup=0
nbDupNodes=0
itF=0
distVal=1.
!
do itN=1,nbNodes
!coordinate current node
coorNode=arrayIn(itN,:)
currUnique=.true.
!search for current nodes coordinates in the list of nodes
if (listDup(itN).eq.0) then
do itM=itN+1,nbNodes
! compute distance to current point
distVal = NORM2(coorNode-arrayIn(itM,:))
! if the distance is to small then duplicate
if (distVal.le.paraEps) then
!first time it is a duplicate
if (currUnique) then
currUnique=.false.
nbDupNodes=nbDupNodes+1
listDup(itN)=nbDupNodes
endif
listDup(itM)=nbDupNodes
endif
enddo
endif
enddo
print *,listDup
end subroutine get_dup_nodes
end module test
program testb
use test
implicit none
double precision,dimension(33) :: valTmp
double precision,dimension(11,3) :: verticesArray
integer :: nbd
integer :: i,j,k
valTmp = (/ 0.,0.,0.,0.,1.,0.,1.,0.,0.,1.,0.,0.,1.,1.,1.,0.,1.,0.,1.,2.,3.,1.,2.,3.,0.,1.0001,0.,0.,1.1,0.,1.,2.,4. /)
k=1
do i=1,11
do j=1,3
verticesArray(i,j)=ValTmp(k)
k=k+1
enddo
print *,verticesArray(i,:)
enddo
call get_dup_nodes(11,3,verticesArray,nbd)
end program testb
Additional request: if you have references of books about such kind of algorithm, it could be useful.
I have a file that defines a set of tiles (used in an online game). The format for each tile is as follows:
x: 12 bits
y: 12 bits
tile: 8 bits
32 bits in total, so each tile can be expressed as a 32 bit integer.
More info about the file format can be found here:
http://wiki.minegoboom.com/index.php/LVL_Format
http://www.rarefied.org/subspace/lvlformat.html
The 4 byte structures are not broken along byte boundaries. As you can see x: and y: are both defined as 12 bits. ie. x is stored in 1.5 bytes, y is stored in 1.5 bytes and tile is stored in 1 byte.
Even though x and y use 12 bits their max value is 1023, so they could be expressed in 10 bits. This was down to the creator of the format. I guess they were just padding things out so they could use a 32-bit integer for each tile? Either way, for x and y we can ignore the final 2 bits.
I'm using a nodejs Buffer to read the file and I'm using the following code to read the values.
var n = tileBuffer.readUInt32LE(0);
var x = n & 0x03FF;
var y = (n >> 12) & 0x03FF;
var tile = (n >> 24) & 0x00ff;
This code works fine but when I read the bits themselves, in an attempt to understand binary better, I see something that confuses me.
Take, for example a int that expresses the following:
x: 1023
y: 1023
tile: 1
Creating the tiles in a map editor and reading the resulting file into a buffer returns <Buffer ff f3 3f 01>
When I convert each byte into a string of bits I get the following:
ff = 11111111
f3 = 11110011
3f = 00111111
01 = 00000001
11111111 11110011 00111111 00000001
I assume I should just take the first 12 bits as x but chop off the last 2 bits. Use the next 12 bits as y, chopping off 2 bits again, and the remaining 8 bits would be the tile.
x: 1111111111
y: 0011001111
tile: 00000001
The x is correct (1111111111 = 1023), the y is wrong (0011001111 = 207, not 1023), and tile is correct (00000001 = 1)
I'm confused and obviously missing something.
It makes more sense to look at it in this order: (this would be the binary representation of n)
00000001 00111111 11110011 11111111
On that order, you can easily do the masking and shifting visually.
The problem with what you did is that for example in 11111111 11110011, the bits of the second byte that belong to the first field are at the right (the lowest part of that byte), which in that order is discontinuous.
Also, masking with 0x03FF makes those first two fields have 10 bits, with two bits just disappearing. You can make them 12 bits by masking with 0x0FFF. As it is now, you effectively have two padding bits.
As I'm trying to fit a function to some experimental data, I've written a function with three inputs, three parameters and one output:
qrfunc = #(x, p) exp(-1*p(1)*x(:,1) - p(2)*x(:,2))+p(3)*x(:,3)+20;
When I generate some input and output values:
pS = [0.5; 0.3; 0.3];
x1 = [1 1 1; 1 1.1 1; 1 1.1 1.1; 2 1.2 2];
y1 = qrfunc(x1, pS);
And call the leasqr function:
pin =[1; 1; 1];
[f1, p1, kvg1, iter1, corp1, covp1, covr1, stdresid1, Z1, r21] = leasqr(x1, y1, pin, qrfunc, 0.0001);
This works correct, the function makes 7 iterations and provides the right outputs.
But when I load x1 from my experimental data (a text file with three columns, about 1500 lines) as well as my y1 (a text file with the same amount of lines) and run the same function, it only makes one iteration, and does not change the parameters.
It even shows that the error margins are very high:
sqrt(diag(covp1))
ans =
3.0281e+004
3.7614e+005
1.9477e-002
What am I doing wrong? There are no error messages, no 'Convergence not achieved' or anything like that...
Edit:
The data is loaded with the command:
load "input.txt"
load "output.txt"
Proof of loading:
size(input)
ans =
1540 3
The first few lines from my input file:
10 0.4 5
20 0.4 5
30 0.4 5
40 0.4 5
50 0.4 5
The second and third parameters have different values further down the line.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I love challenges like this, I'll hopefully submit my answer soon.
Which player has the best 7 card hand?
Given an unordered list of 9 cards (separated by a space), work out which player has the best poker hand. Here is a list of poker hand rankings. Example input:
2C 5H AS KS 2D 4D QD KH 3S
(ie: [[2C 5H] [AS KS] [2D 4D QD KH 3S]])
First 2 cards in the array represent player 1's hand, second 2 in the array represent player 2's hand. The last 5 cards represent the community cards, cards both players share. In effect, both players have 7 cards, and you must determine which player has the best 5 card poker hand.
A card is defined as a string, with the first character representing the card value, and the second value representing the suit. Always upper-case. No card may appear twice.
The function will calculate if the hand is a draw or a win to either player. It will ouput the totals at the end of the input. The output format is defined later on in this post.
Examples
2C 5H AS KS 2D 4D QD KH 3S
(ie: [[2C 5H] [AS KS] [2D 4D QD KH 3S]])
Player 2 wins this hand. Player 1 has a pair of 2's, player 2 has a pair of kings.
5S 6S 8H 9D 7S 8S JH TS 2H
(ie: [[5S 6S] [8H 9D] [7S 8S JH TS 2H]])
Player 1 wins this hand Player 1 has a flush, player 2 has a straight.
2S 2H AC AS 2C AH 9H TS 2D
(ie: [[2S 2H] [AC AS] [2C AH 9H TS 2D]])
Player 1 wins this hand. Player 1 has quads, player 2 has a full house
5S 6S 2D 4D 9S AS KD JC 9D
(ie: [[5S 6S] [2D 4D] [9S AS KD JC 9D]])
A draw. Both players have Ace high.
More Info
Thanks to mgroves for the following link to Project Euler which has a similar problem:
http://projecteuler.net/index.php?section=problems&id=54
Test Data
We will use the Project Euler test data:
http://projecteuler.net/project/poker.txt
Your solution should accept that text file as input, and output a total of wins and draws.
Example Output
Output must be in this format:
1: 45
2: 32
D: 12
Player 1 won 45 hands, player 2 won 32 hands, and there were 12 draws. (Not actual results)
Rules
Doesn't have to return the winning hand type, only WHO won if anyone
Card list input has no particular order
No card appears twice in the input
Input is always uppercase
Takes the Project Euler test data as an input
Outputs a count, of which player won the most hands and total draws in given format above
Perl, 414 398 370/458 344/416 char
Line breaks are not significant.
%M=map{$_,$Z++}0..9,T,J,Q,K,A;sub N{/.$/;$M{$`}.$&}
sub B{$s=#p=();
for$m(#_){$m-$_||($s+=2,++$p[$m])for#_}
#_=sort{$p[$b]-$p[$a]||$b-$a}#_;
$s=23 if$s<11&&($_[0]-$_[4]<5||$_[0]-$_[1]>8&&push#_,shift);
"#_"=~/.$/;$s+=14*(4<grep/$&/,#_);
$s=100*$s+$_ for#_;$s}
++$X{B((#c=map{N}split)[0..4])<=>B(#c[5..9])}for<>;
printf"1: %d\n2: %d\nD: %d\n",#X{1,-1,0}
This solves the "10 card" problem (10 cards are dealt, player 1 has the first 5 cards and player 2 has the second 5 cards).
The first section defines a subroutine N that can transform each card so that it has a numerical value. For non-face cards, this is a trivial mapping (5H ==> 5H) but it does transform the face cards (KC => 13C, AD => 14D).
The last section parses each line of input into cards, transforms the cards to contain numerical values, divides the cards into separate hands for the two players, and analyzes and compares those hands. Every hand increments one element of the hash %X. When all the input is parsed, %X contains the number of hands won by player 1, won by player 2, or ties.
The middle section is a subroutine that takes a set of five cards as input and produces a
12-digit number with the property that stronger poker hands will have higher-valued numbers. Here's how it works:
for$m(#_){$m-$_||($s+=2,++$p[$m])for#_}
This is the "pair" detector. If any two cards have the same numerical value, increment a hash element for one of the cards and increase the "score" variable $s by two. Note that we will end up comparing each card to itself, so $s will be at least 10 and $p[$x] will be at least one for every card $x. If the hand contains three of a kind, then those three cards will match with the other two cards -- it will be like there are 9 matches among those three cards and the "score" will be at least 18.
#_=sort{$p[$b]-$p[$a]||$b-$a}#_;
Sort the cards by (1) the number of times that card is part of a "pair" and (2) the value of the card. Thus in a hand with two 7's and two 3's, the two 7's will appear first, followed by the two 3's, followed by the kicker. In a hand with two 7's and three 3's, the three 3's will be first followed by the two 7's. The goal of this ordering is to distinguish two hands that have the same score -- a hand with a pair of 8's and a hand with a pair of 7's both have one pair, but we need to be able to tell that a pair of 8's is better.
$s=23 if$s<11&&($_[0]-$_[4]<5||$_[0]-$_[1]>8&&push#_,shift);
This line is the "straight" detector. A straight is worth 23 points and occurs when there are no pairs in the hand ($s<11 means only 5 "pairs" - each card matching with itself - were found) and either (1) the value of the highest card is exactly four more than the value of the lowest card ($_[0]-$_[4]==4), or (2) the highest value card is an Ace and the next highest card is a 5 ($_[0]-$_[1]==9), which means the hand has an A-2-3-4-5 straight. In the latter case, the Ace is now the least valuable card in the hand, so we manipulate #_ to reflect that (push#_,shift)
"#_"=~/.$/;$s+=14*(4<grep/$&/,#_);
This line is the flush detector. A flush is worth 14 more points and occurs when the last character is the same for each card. The first expression ("#_"=~/.$/) has the side effect of setting $& to the last character (the suit) of the last card in the hand. The final expression (4<grep/$&/,#_) will be true if and only if all elements of #_ have the same last character.
$s=100*$s+$_ for#_;$s}
Creates and returns a value that begins with the hand's score and then contains the values of the cards, in order of the card's importance. Scores for the various hands will be
Hand Score
---------- ------
High card 10 (each card matches itself for two points)
One pair 14 (2 additional matches)
Two pair 18 (4 additional matches)
Three of a kind 22 (6 additional matches)
Straight 23 (no pair, but 23 points for straight)
Flush 24 (no pair, but 14 additional points for the flush)
Full house 26 (8 additional matches)
4 of a kind 34 (12 additional matches)
Straight flush 37 (23 + 14 points)
which is consistent with the rules of poker. Hands with the same score can be distinguished by the values of the hand's cards, in order of importance to the hand, all the way down to the least valuable card in the hand.
The solution to the 9 card problem (two cards to player 1, two cards to player 2, the players share the next 5 cards and build their best 5 card hand) needs about 70 more strokes to choose the best 5 card hand out of the 7 cards available to each player:
%M=map{$_,$Z++}0..9,T,J,Q,K,A;sub N{/./;$M{$&}.$'}
sub A{my$I;
for$k(0..41){#d=#_;splice#d,$_,1for$k%7,$k/7;$s=#p=();
for$m(grep$_=N,#d){$m-$_||($s+=2,$p[$m]++)for#d}
#d=sort{$p[$b]-$p[$a]||$b-$a}#d;
$s=23 if$s<11&&($d[0]-$d[4]<5||$d[0]-$d[1]>8&&push#d,shift#d);
"#d"=~/.$/;$s+=14*(4<grep/$&/,#d);
$s=100*$s+$_ for#d;
$I=$s if$s>$I}$I}
++$X{A((#c=split)[0,1,4..8])<=>A(#c[2..8])}for<>;
printf"1: %d\n2: %d\nD: %d\n",#X{1,-1,0}
GolfScript - 151/187 chars
This program works on an input list of 10 cards per line, i.e. two 5 card hands.
n%0.#{3/5/{[zip~;.&,(!15*\[{n),*"TJQKA"+?}/]:|$),-4>=14*+1|{.2\?|#-,5\-.49?#*#+\.+#+\}/.16445=13*#+\]}%.~={0):0;;}{~>.!#+\#+\}if}/"1: "##n"2: "#n"D: "0
This program works on an input list of 9 cards per line, of the format described in the specifications.
n%0.#{3/.4>:D;2/2<{D+.{3/1$^.{3/1$^[zip~;.&,(!15*\[{n),*"TJQKA"+?}/]$:|),-4>=14*+1|{.2\?|#-,5\-.49?#*#+\.+#+\}/.16445=13*#+\]}%\;~}%$-1=\;}%.~={0):0;\(\}*~>.!#+\#+\}/"1: "##n"2: "#n"D: "0
Haskell: 793 796 806 826 864 904 901 880 863
Since the text file is inconsistent with 9 card hands, I'm just reading a line from the console and outputting who wins.
Bugfixes:
Ace now counts lower than a 2 in an ace-low run.
Comparing full houses fixed (again :D).
Guarantees that the best version of a given hand type is chosen. For example, if a player can choose between a 2-6 run and a 3-7 run, the 3-7 run is chosen (flushes aside).
Now shorter than the PHP solution!
Golfed:
import Data.List
(%)=mod
m=map
y=foldr1
t=0<1
z=13
w=[0,1,2,3,12]
n&x|length x<n=[]|t=take n x
b?x|b=x|t=[]
n!k= \c->e(n&m(%k)c)?(n&c)
e[]=1<1
e(x:y)=all(x==)y
k q c|any null[q c,p$c\\q c]=[]|t=q c
f=5!4
s c=(sort(m(%z)c)`elem`w:[[n..n+4]|n<-[0..8]])?c
r=3!z
p=2!z
g x y|c x y<2=x|t=y
q x(_,[])=x
q _ y=y
b h=y q$m($h)$zipWith(\t f->(,)t.y g.m(f.take 5).permutations)[1..][1!1,p,k p,r,s,f,k r,4!z,s.f]
h=reverse.a.m(%z)
a v|w\\v==[]=[-1..3]|t=sort v
c x y=o(h x)$h y
o[](_:_)=2
o[]_=0
o _[]=1
o(a:b)(k:d)|a>k=1|a<k=2|t=o b d
d n(a,k)|a==[]=0|n<1=0|r>s=1|r<s=2|f/=0=f|t=d(n-length o)(a\\o,k\\u)where(r,o)=b a;(s,u)=b k;f=c o u
i x=head.findIndices(x==)
u(n:k)c#[r,s]|n%z==i r"23456789TJQKA"&&n%4==i s"HDSC"=n|t=u k c
l c=(2&c++snd(splitAt 4c),drop 2c)
main=getLine>>=print.d 5.l.m(u[0..]).words
Ungolfed:
import Control.Exception (assert)
import Data.List (permutations, sort, intersect, findIndices, (\\))
import Data.Function (on)
(%) = mod
aceLowRun = [0,1,2,3,12]
tryTake n xs
| length xs < n = []
| otherwise = take n xs
cond ? xs
| cond = xs
| otherwise = []
eqOn n f cards = allEq (tryTake n $ map f cards) ? tryTake n cards
allEq [] = False
allEq (x:xs) = all (== x) xs
combWithPair pokerHand cards
| any null [picked1, picked2] = []
| otherwise = pokerHand cards
where
picked1 = pokerHand cards
picked2 = pair $ cards \\ picked1
straightFlush = straight . flush
quads = eqOn 4 (% 13)
fullHouse = combWithPair triples
flush = eqOn 5 (% 4)
straight cards = (sort (map (% 13) cards) `elem` runs) ? cards
where
runs = aceLowRun : [[n..n+4] | n <- [0..8]]
triples = eqOn 3 (% 13)
twoPair = combWithPair pair
pair = eqOn 2 (% 13)
single = eqOn 1 id
bestVersionOfHand [] ys = ys
bestVersionOfHand xs [] = xs
bestVersionOfHand xs ys
| compareSameRankedHands xs ys < 2 = xs
| otherwise = ys
rate rating pokerHand cards = (rating, handResult)
where
handResult = foldr1 bestVersionOfHand
(map (pokerHand . take 5) $ permutations cards)
pokerHands = zipWith rate [1..] [
single
, pair
, twoPair
, triples
, straight
, flush
, fullHouse
, quads
, straightFlush
]
bestHand hand = foldr1 (\xs ys -> if null (snd ys) then xs else ys)
(map ($ hand) pokerHands)
highestVals = reverse . arrangeVals . map (% 13)
where
arrangeVals vals = if vals `intersect` aceLowRun == aceLowRun
then [-1..3]
else sort vals
compareSameRankedHands = compareSameRankedHands' `on` highestVals
compareSameRankedHands' [] [] = 0
compareSameRankedHands' (card1:cards1) (card2:cards2)
| card1 > card2 = 1
| card1 < card2 = 2
| otherwise = compareSameRankedHands' cards1 cards2
decideWinner n cards1 cards2
| null cards1 = assert (null cards2) 0
| n < 1 = 0
| rating1 > rating2 = 1
| rating1 < rating2 = 2
| cmpRes /= 0 = cmpRes
| otherwise = decideWinner
(n - assert (length bests1 == length bests2) (length bests1))
(cards1 \\ bests1)
(cards2 \\ bests2)
where
(rating1, bests1) = bestHand cards1
(rating2, bests2) = bestHand cards2
cmpRes = compareSameRankedHands bests1 bests2
indexOf x = head . findIndices (x==)
toNum = toNum' [0..]
toNum' (n:ns) [rank, suit]
| n % 13 == indexOf rank "23456789TJQKA" && n % 4 == indexOf suit "HDSC" = n
| otherwise = toNum' ns [rank, suit]
cluster cards = (take 2 cards ++ snd (splitAt 4 cards), drop 2 cards)
main = getLine >>= print
. uncurry (decideWinner 5)
. cluster
. map toNum
. words
GolfScript 258 241 247/341 217/299 char
Solution for the 10 card problem. Only the last couple of newlines are significant:
10:T):J):Q):K):A;0:a;0:b;0:d;"\r\n"%{' '/5/{.{)\;}/4*-+++!:f;{);~}%{$0:z(%{.z-
!99*+:z}%}5*.{+}*99/:P!{..)\(#4+-!2*\;\.2<~9+-!\;+}and:s;[s f*6P=4P=f s P 6$]\;}
%.~={;;d):d;}{~>{a):a;}{b):b;}if}if}/
'1: 'a'
2: 'b'
D: 'd n
The 9 card problem currently needs about 80 more characters.
10:T):J):Q):K):A;0:a;0:b;0:d;"\r\n"%{' '/);{('Z'%+}2*[0$2>\7<]
{:H;7,{H=:x;H{x=!},:I;6,{I=:x;I{x=!},}/}%{.{)\;}/4*-+++!:f;
{);~}%{$0:z(%{.z-!99*+:z}%}5*.{+}*99/:P!{..)\(#4+-!2*\;\.2<~9+-!\;+}and:s;[
s f*6P=4P=f s P 6$]\;}%{[\].~>{~;}{~\;}if}*}%.~={;;d):d;}{~>{a):a;}{b):b;}if}if}/
'1: 'a'
2: 'b'
D: 'd n
Less golfed version of 10 card problem.
10:T;11:J;12:Q;13:K;14:A; # map for face cards
0:a;0:b;0:d; # other initialization
"\r\n"% # split input on \n
{ # on each line of input
' '/ # divide line into ten cards
5/ # split into five card hands
{. # on each of the two hands
{)\;}% # chop last character of each card
.(5*\;\{+}*= # check sum of elem == 5*1st elem
:f; # this is the flush flag
{);~}%$ # reduce cards to numerical values
0:z;{.z- 20%{}
{;z 20+}if:z}%{-1*}$ # detect pairs
.(:h;; # extract value of highest card
20h>{..)\(#4+-!2*\;\ # detect straight
.2<~9+-!\;+}and:s; # s=2 for regular straight, s=1 for A-5 straight
# result of this mapping - 6 elem array
[ 0$ # #6 - cards in the hand
.{20/}%{+}*:P # #5 - number of pairs
s # #4 - is this a straight?
f # #3 - is this a flush?
4P= # #2b - is this a full house?
h 59> # #2 - is this 4 of a kind?
s f * # #1 - is this a straight flush?
]-1%
\;
}/
\.#.# # put [hand1 hand2 hand1 hand2] on stack
= # check hand1==hand2
{;;d):d;} # if equal, increment d (draw)
{>{a):a;} # if >, increment a (player 1 wins)
{b):b;}if # if <, increment b (player 2 wins)
}if
}/
# output results
'1: 'a'
2: 'b'
D: 'd n
C, 665+379 chars
Here's my answer in 2 parts.
The first is a complete 7 card evaluator, including the "AddCard" macro A. It returns a 32-bit number ranking the hand. The high nibble is the type, bits 13..25 indicate the high card(s) and bits 0..12 indicate the kicker(s). When comparing the results, the better hand will always have the larger value.
#define U unsigned
#define c(a)X=a;i=C=0;while(X){C|=(X&1)<<i++;X/=4;}
#define A(h,c)h[c&7]+=c,h[3]|=c
U C,i,X;
U E(U h[]){
U a=h[0]+h[1]+h[2]+h[4]-(h[3]&-16),t,v,k,e=a&0x55555540,o=a&0xAAAAAA80;
if(v=e&o/2){t=7;k=h[3]^v;i=0;while(k/=4)i++;k=1<<2*i;}
else if(v=o&o-1){t=6;v/=2;k=o/2^v;}
else if(e>1&o>1){t=6;v=o/2;k=(i=e&e-1)?i:e;}
else{a=h[3];
if(t=h[i=1]-(a&1)&4||h[i=2]-(a&2)&8||h[i=4]-(a&4)&16||h[i=0]-(a&8)&32)a=h[i];
a&=-64;v=a|a>>26&16;t*=5;
if(v=v&v<<2&v<<4&v<<6&v<<8){t+=4;a=v&=~(v/2);}
else if(t)for(i=(h[i]&63)/(i?i:8),v=a;i-->5;)a&=a-1;
else if(v=o/2)t=3;
else if (e){o=e&e-1;v=(i=o&o-1)?o:e;t=1+(o>0);}
k=a^v;k&=k-1;k&=k-(i==0);}
c(v);v=C/8;c(k);
return t<<28|v<<13|C/8;}
The second is the input processor. It parses the project Euler file as 2+2+5 cards (ignoring the 10th card). It uses the Parse macro, P to create 32-bit values representing each card. The representation is 0A0K0Q0J0T090807060504030200shdc. A hand is stored as an array of 5 ints.
char*gets(char*);char*strchr(char*,char);
#define P(c)X=strchr(R,*c++)-R;C=1<<strchr(S,*c++)-S|64<<X*2;c++;
#define L(n)for(i=0;i<n;i++)
U g[5],h[5];
char*c,b[32];
char*S="CDHS";
char*R="23456789TJQKA";
int d,r[3]={0};
main(q){while(c=gets(b)){
L(2){P(c)A(g,C);}
L(2){P(c)A(h,C);}
L(5){P(c)A(g,C);A(h,C);}
d=E(g)-E(h);
r[d>0?0:d<0?1:2]++;
L(7)g[i]=h[i]=0;
}L(3)printf("%c:%d\n","12D"[i],r[i]);}
I'm sure there are a few more characters to be trimmed off. I'll add an explanation soon.
The evaluator runs #17.6 Million hands/second on my 3Ghz Core2 Duo. That's only 3.5x slower than the PokerSource evaluator, which uses at least 56K of lookup tables.
PHP, 799 chars
Line breaks are not significant. This takes input from the linked url, which is different from the example input (doesn't deal with community cards). Processing is similar to mobrule's perl answer, with a different scoring method.
<?php
function s($i){$o=array_map('intval',$i);$f=(count(array_unique(str_replace($o,'',$i)))==1);
sort($o);$v=array_count_values($o);arsort($v);$u=array_keys($v);$h=max($u);$m=$u[0];$c=reset($v);
$p=count($v);$e=$c==1&&$o[4]==14&&$o[3]==5;$r=$o==range($o[0],$o[0]+4)||$e;$q=$e?5:$h;
$s=($f&&$r&&($h==12)?2<<11:($f&&$r?(2<<10)+$q:0))+($c==4?(2<<9)+$m:0)+($c==3&&$p==2?(2<<8)+$m:0)+($f?(2<<7)+$h:0)+
($r?(2<<6)+$q:0)+($c==3?(2<<5)+$m:0)+($c==2&&$p==3?(2<<4)+$m:0)+($p==4?(2<<3)+$m:0);$s+=!$s?$h:0;return array($s,$u);}
foreach(file($argv[1]) as $d){
list($y,$z)=array_chunk(explode(' ',trim(strtr($d,array('T'=>10,'J'=>11,'Q'=>12,'K'=>13,'A'=>14)))),5);
$y=s($y);$z=s($z);$w=$y[0]-$z[0];$x=1;while(!$w&&$x<5){$w=$y[1][$x]-$z[1][$x++];}if(!$w)#$t++;elseif($w<0)#$l++;else #$k++;}
#print "1: $k\n2: $l\nD: $t";