I am trying to write a Cuda kernel to generate row-wise histogram based on the input feature set (2 x 6) where each feature row (each having 6 features) is to generate a histogram having nbins=10.
I have implemented the below code but it doesn’t seem to generate the correct row-wise histogram.
import numba
import numpy as np
from numba import cuda
np.random.seed(0)
feature = np.random.randint(1, high=6, size=(2,6), dtype=int)
output = np.zeros(20).astype(np.float32).reshape(2,10)
### Kernal Configuration
threads_per_block = 6
blocks = 2
# moving data to device
d_feature = cuda.to_device(feature)
d_output = cuda.to_device(output)
feature_size = d_feature.shape[1]
#cuda.jit
def row_wise_histogram(feature, output, n):
xmin = np.float32(-4.0)
xmax = np.float32(4.0)
idx = cuda.grid(1)
nbins = 10
bin_width = (xmax - xmin) / nbins
for i in range(n):
# Each thread will take all the row features to generate historgram
input = feature[idx][i]
bin_number = np.int32(nbins * (np.float32(input) - np.float32(xmin)) / (np.float32(xmax) - np.float32(xmin)))
if bin_number >= 0 and bin_number < output.shape[1]:
cuda.atomic.add(output[idx], bin_number, 1)
row_wise_histogram[blocks, threads_per_block](d_feature, d_output, feature_size)
print(d_output.copy_to_host())
And the out results in
[[ 0. 0. 0. 0. 0. 0. 81111. 81111. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 162222. 0. 81111. 0.]]
which is wrong, Will appreciate it if I can get help with the issue inside the row_wise_historgram function!
I think the main issue you have in your code is that your kernel has a thread strategy to have each thread process a row, and you have 2 rows in your feature dataset, but you are launching 12 threads total:
### Kernal Configuration
threads_per_block = 6
blocks = 2
10 of those threads will be indexing out-of-bounds. For 2 rows you only need 2 threads. We can fix this multiple ways, but I will add a "thread-check" to your kernel code, to prevent out-of-bounds threads from doing anything.
You are also histogramming values that don't fit in your output array. Let's suppose your feature has a input value of 4 at some location. Let's put that value through your arithmetic:
bin_number = np.int32(nbins * (np.float32(4) - np.float32(-4)) / (np.float32(4) - np.float32(-4)))
That is 10 * (4-(-4))/(4-(-4))
So that is a bin index of 10. But you only have 10 bins, so valid bin index can only go up to 9. Which means some of your input values (e.g. 4, 5) will not be recorded in your output.
The following code is your code to add the threadcheck, plus the range of input adjusted. And I am printing out the input, the bin each input value was assigned, and the output bins. It seems to be working correctly.
$ cat t65.py
import numba
import numpy as np
from numba import cuda
np.random.seed(0)
feature = np.random.randint(1, high=4, size=(2,6), dtype=int)
output = np.zeros(20).astype(np.float32).reshape(2,10)
mybin = np.empty_like(feature)
### Kernal Configuration
threads_per_block = 6
blocks = 2
# moving data to device
d_feature = cuda.to_device(feature)
d_output = cuda.to_device(output)
feature_size = d_feature.shape[1]
d_mybin = cuda.to_device(mybin)
#cuda.jit
def row_wise_histogram(feature, output, mybin, n):
xmin = np.float32(-4.0)
xmax = np.float32(4.0)
idx = cuda.grid(1)
nbins = 10
bin_width = (xmax - xmin) / nbins
if idx < output.shape[0]:
for i in range(n):
# Each thread will take all the row features to generate historgram
input = feature[idx][i]
bin_number = np.int32(nbins * (np.float32(input) - np.float32(xmin)) / (np.float32(xmax) - np.float32(xmin)))
mybin[idx][i] = bin_number
if bin_number >= 0 and bin_number < output.shape[1]:
cuda.atomic.add(output[idx], bin_number, 1)
row_wise_histogram[blocks, threads_per_block](d_feature, d_output, d_mybin, feature_size)
print(feature)
print(d_mybin.copy_to_host())
print(d_output.copy_to_host())
$ python t65.py
[[1 2 1 2 2 3]
[1 3 1 1 1 3]]
[[6 7 6 7 7 8]
[6 8 6 6 6 8]]
[[ 0. 0. 0. 0. 0. 0. 2. 3. 1. 0.]
[ 0. 0. 0. 0. 0. 0. 4. 0. 2. 0.]]
$ cuda-memcheck python t65.py
========= CUDA-MEMCHECK
[[1 2 1 2 2 3]
[1 3 1 1 1 3]]
[[6 7 6 7 7 8]
[6 8 6 6 6 8]]
[[ 0. 0. 0. 0. 0. 0. 2. 3. 1. 0.]
[ 0. 0. 0. 0. 0. 0. 4. 0. 2. 0.]]
========= ERROR SUMMARY: 0 errors
$
Note that when I restrict the input values to 1..3, then the maximum bin index is 8 (do the math). If I increase the input range to include 4, the maximum bin index goes to 10, which "won't fit". You're correctly handling this case, but it may confuse you, as these values of 4 or 5 won't be recorded in the output. Histogram bin arithmetic is fun. You will need to work out exactly what you want.
Also note that if you run this code, you should see output almost exactly the same as above. If you don't, there is a good chance your numba or cuda install is broken somehow, and the additional run I show with cuda-memcheck will help to discover what may be the issue.
Note that since you are using atomics anyway, there isn't any particular need to assign one thread to each row, you could instead assign one thread to each input point. But that isn't your question; it's a story for another day. Conversely, if you do proceed with one thread per row, each thread doing effectively a private histogram, there is no particular need to use atomics.
So, I'd like to write larger functions in Commodore 64 BASIC. So far, from what I'm seeing from other sources (such as various C64 wikis, as well as the user's manual for the C64 itself,) function definitions can only be one line long.
That is to say, I can't seem to find an analogous construct in BASIC to brackets/whatever else other languages use to delineate code blocks.
Does anyone know how I'd write code blocks in BASIC that are more than one line?
Example of one-line function:
10 def fn X(n) = n + 1
20 print fn X(5) rem Correctly called function. This will output 6
But I can't do something like:
10 def fn X(n) =
20 n = n + 1
30 print n
40 rem I'd like the definition of function X to end at line 30 above
50 fn X(5) rem Produces syntax error on line 40
Thank you for your time!
Sadly C64 BASIC doesn't support more complicated functions.
It does however support more complicated subroutines, and that's what you want in this case.
10 rem you can set up n in advance here
20 n = 23
30 gosub 50
40 rem n is now 24
50 rem start of subroutine; this line is not needed, it's just here for clarity
60 n=n+1
70 print n
80 return
90 rem now you can call the subroutine on line 50 and it'll return at line 80
Unfortunately passing parameters in to and returning values out of subroutines in C64 BASIC aren't formalized constructs, so you'll just have to work with ordinary variables as shown above.
From what I recall, you can do this virtually using a colan to have multiple commands on one line. Not the most elegant solution, but will let you break things up:
10 def fn X(n) =
20 n = n + 1
30 print n
40 rem I'd like the definition of function X to end at line 30 above
50 fn X(5) rem Produces syntax error on line 40
Becomes
10 n=n+1: print n
Note that you cannot pass arguments, so you would have to declare things and let the BASIC stack take care of it for you. Commonly I would structure programs like so:
1 rem lines 1-99 are definitions.
2 n% = 0 : rem this declares the variable n as an integer, initializing it to 0
100 rem lines 100-59999 are the core code
101 n%=5 : gosub 60100
59999 end : rem explicit end of the program to ensure we don't run into our subroutine block
60000 rem lines 60000+ are my subroutines..
60100 n% = n% + 1 : print n% : return
It's been a while; from memory the % character is what declares a variable as an integer, similar to $ declaring it as a string.
You can use existing variables and mathematical commands with DEF FN, for instance, if you wanted to PRINT 0 to 10 inclusive in 4-bit nybbles, one could do this:
0 DEF FN B(X)=SGN(X AND B)
1 FOR I=0 TO 10: REM OUR COUNTER
2 B=8: REM OUR BIT MARKER (128, 64, 32, 16, 8, 4, 2, 1)
3 FOR J=0 TO 3: REM WE WANT 4-BIT NYBBLES, SO 0 TO 3 INCLUSIVE
4 PRINT RIGHT$(STR$(FN B(I)),1);: REM CALLS OUR FUNCTION
5 B=B/2: REM MOVES TO NEXT BIT MARKER
6 NEXT J: REM PROCESS FOR LOOP J
7 PRINT: NEXT I: REM NEW LINE THEN PROCESS FOR LOOP I
I've tried nesting functions but it gets too confusing. In fact, I haven't seen many listings that use DEF FN. Maybe some high-brow artisan hipster BASIC programmers use them?
Let i = [1 2] and j = [3 5]. Now in octave:
arrayfun(#(x,y) x+y,i,j)
we get [4 7]. But I want to apply the function on the combinations of i vs. j to get [i(1)+j(1) i(1)+j(2) i(2)+j(1) i(2)+j(2)]=[4 6 5 7].
How do I accomplish this? I know I can go with for-loopsl but I want vectorized-code because it's faster.
In Octave, for finding summations between two vectors, you can use a truly vectorized approach with broadcasting like so -
out = reshape(ii(:).' + jj(:),[],1)
Here's a runtime test on ideone for the input vectors of size 1 x 100 each -
-------------------- With FOR-LOOP
Elapsed time is 0.148444 seconds.
-------------------- With BROADCASTING
Elapsed time is 0.00038299 seconds.
If you want to keep it generic to accommodate operations other than just summations, you can use anonymous functions like so -
func1 = #(I,J) I+J;
out = reshape(func1(ii,jj.'),1,[])
In MATLAB, you could accomplish the same with two bsxfun alternatives as listed next.
I. bsxfun with Anonymous Function -
func1 = #(I,J) I+J;
out = reshape(bsxfun(func1,ii(:).',jj(:)),1,[]);
II. bsxfun with Built-in #plus -
out = reshape(bsxfun(#plus,ii(:).',jj(:)),1,[]);
With the input vectors of size 1 x 10000 each, the runtimes at my end were -
-------------------- With FOR-LOOP
Elapsed time is 1.193941 seconds.
-------------------- With BSXFUN ANONYMOUS
Elapsed time is 0.252825 seconds.
-------------------- With BSXFUN BUILTIN
Elapsed time is 0.215066 seconds.
First, your first example is not the best because the most efficient way to accomplish what you're doing with arrayfun would be to vectorize:
a = [1 2];
b = [3 5];
out = a+b
Second, in Matlab at least, arrayfun is not necessarily faster than a simple for loop. arrayfun is mainly a convenience (especially for it's more advanced options). Try this simple timing example yourself:
a = 1:1e5;
b = a+1;
y = arrayfun(#(x,y)x+y,a,b); % Warm up
tic
y = arrayfun(#(x,y)x+y,a,b);
toc
y = zeros(1,numel(a));
for k = 1:numel(a)
y(k) = a(k)+b(k); % Warm up
end
tic
y = zeros(1,numel(a));
for k = 1:numel(a)
y(k) = a(k)+b(k);
end
toc
In Matlab R2015a, the for loop method is over 70 times faster run from the Command window and over 260 times faster when run from an M-file function. Octave may be different, but you should experiment.
Finally, you can accomplish what you want using meshgrid:
a = [1 2];
b = [3 5];
[x,y] = meshgrid(a,b);
out = x(:).'+y(:).'
which returns [4 6 5 7] as in your question. You can also use ndgrid to get output in a different order.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I love challenges like this, I'll hopefully submit my answer soon.
Which player has the best 7 card hand?
Given an unordered list of 9 cards (separated by a space), work out which player has the best poker hand. Here is a list of poker hand rankings. Example input:
2C 5H AS KS 2D 4D QD KH 3S
(ie: [[2C 5H] [AS KS] [2D 4D QD KH 3S]])
First 2 cards in the array represent player 1's hand, second 2 in the array represent player 2's hand. The last 5 cards represent the community cards, cards both players share. In effect, both players have 7 cards, and you must determine which player has the best 5 card poker hand.
A card is defined as a string, with the first character representing the card value, and the second value representing the suit. Always upper-case. No card may appear twice.
The function will calculate if the hand is a draw or a win to either player. It will ouput the totals at the end of the input. The output format is defined later on in this post.
Examples
2C 5H AS KS 2D 4D QD KH 3S
(ie: [[2C 5H] [AS KS] [2D 4D QD KH 3S]])
Player 2 wins this hand. Player 1 has a pair of 2's, player 2 has a pair of kings.
5S 6S 8H 9D 7S 8S JH TS 2H
(ie: [[5S 6S] [8H 9D] [7S 8S JH TS 2H]])
Player 1 wins this hand Player 1 has a flush, player 2 has a straight.
2S 2H AC AS 2C AH 9H TS 2D
(ie: [[2S 2H] [AC AS] [2C AH 9H TS 2D]])
Player 1 wins this hand. Player 1 has quads, player 2 has a full house
5S 6S 2D 4D 9S AS KD JC 9D
(ie: [[5S 6S] [2D 4D] [9S AS KD JC 9D]])
A draw. Both players have Ace high.
More Info
Thanks to mgroves for the following link to Project Euler which has a similar problem:
http://projecteuler.net/index.php?section=problems&id=54
Test Data
We will use the Project Euler test data:
http://projecteuler.net/project/poker.txt
Your solution should accept that text file as input, and output a total of wins and draws.
Example Output
Output must be in this format:
1: 45
2: 32
D: 12
Player 1 won 45 hands, player 2 won 32 hands, and there were 12 draws. (Not actual results)
Rules
Doesn't have to return the winning hand type, only WHO won if anyone
Card list input has no particular order
No card appears twice in the input
Input is always uppercase
Takes the Project Euler test data as an input
Outputs a count, of which player won the most hands and total draws in given format above
Perl, 414 398 370/458 344/416 char
Line breaks are not significant.
%M=map{$_,$Z++}0..9,T,J,Q,K,A;sub N{/.$/;$M{$`}.$&}
sub B{$s=#p=();
for$m(#_){$m-$_||($s+=2,++$p[$m])for#_}
#_=sort{$p[$b]-$p[$a]||$b-$a}#_;
$s=23 if$s<11&&($_[0]-$_[4]<5||$_[0]-$_[1]>8&&push#_,shift);
"#_"=~/.$/;$s+=14*(4<grep/$&/,#_);
$s=100*$s+$_ for#_;$s}
++$X{B((#c=map{N}split)[0..4])<=>B(#c[5..9])}for<>;
printf"1: %d\n2: %d\nD: %d\n",#X{1,-1,0}
This solves the "10 card" problem (10 cards are dealt, player 1 has the first 5 cards and player 2 has the second 5 cards).
The first section defines a subroutine N that can transform each card so that it has a numerical value. For non-face cards, this is a trivial mapping (5H ==> 5H) but it does transform the face cards (KC => 13C, AD => 14D).
The last section parses each line of input into cards, transforms the cards to contain numerical values, divides the cards into separate hands for the two players, and analyzes and compares those hands. Every hand increments one element of the hash %X. When all the input is parsed, %X contains the number of hands won by player 1, won by player 2, or ties.
The middle section is a subroutine that takes a set of five cards as input and produces a
12-digit number with the property that stronger poker hands will have higher-valued numbers. Here's how it works:
for$m(#_){$m-$_||($s+=2,++$p[$m])for#_}
This is the "pair" detector. If any two cards have the same numerical value, increment a hash element for one of the cards and increase the "score" variable $s by two. Note that we will end up comparing each card to itself, so $s will be at least 10 and $p[$x] will be at least one for every card $x. If the hand contains three of a kind, then those three cards will match with the other two cards -- it will be like there are 9 matches among those three cards and the "score" will be at least 18.
#_=sort{$p[$b]-$p[$a]||$b-$a}#_;
Sort the cards by (1) the number of times that card is part of a "pair" and (2) the value of the card. Thus in a hand with two 7's and two 3's, the two 7's will appear first, followed by the two 3's, followed by the kicker. In a hand with two 7's and three 3's, the three 3's will be first followed by the two 7's. The goal of this ordering is to distinguish two hands that have the same score -- a hand with a pair of 8's and a hand with a pair of 7's both have one pair, but we need to be able to tell that a pair of 8's is better.
$s=23 if$s<11&&($_[0]-$_[4]<5||$_[0]-$_[1]>8&&push#_,shift);
This line is the "straight" detector. A straight is worth 23 points and occurs when there are no pairs in the hand ($s<11 means only 5 "pairs" - each card matching with itself - were found) and either (1) the value of the highest card is exactly four more than the value of the lowest card ($_[0]-$_[4]==4), or (2) the highest value card is an Ace and the next highest card is a 5 ($_[0]-$_[1]==9), which means the hand has an A-2-3-4-5 straight. In the latter case, the Ace is now the least valuable card in the hand, so we manipulate #_ to reflect that (push#_,shift)
"#_"=~/.$/;$s+=14*(4<grep/$&/,#_);
This line is the flush detector. A flush is worth 14 more points and occurs when the last character is the same for each card. The first expression ("#_"=~/.$/) has the side effect of setting $& to the last character (the suit) of the last card in the hand. The final expression (4<grep/$&/,#_) will be true if and only if all elements of #_ have the same last character.
$s=100*$s+$_ for#_;$s}
Creates and returns a value that begins with the hand's score and then contains the values of the cards, in order of the card's importance. Scores for the various hands will be
Hand Score
---------- ------
High card 10 (each card matches itself for two points)
One pair 14 (2 additional matches)
Two pair 18 (4 additional matches)
Three of a kind 22 (6 additional matches)
Straight 23 (no pair, but 23 points for straight)
Flush 24 (no pair, but 14 additional points for the flush)
Full house 26 (8 additional matches)
4 of a kind 34 (12 additional matches)
Straight flush 37 (23 + 14 points)
which is consistent with the rules of poker. Hands with the same score can be distinguished by the values of the hand's cards, in order of importance to the hand, all the way down to the least valuable card in the hand.
The solution to the 9 card problem (two cards to player 1, two cards to player 2, the players share the next 5 cards and build their best 5 card hand) needs about 70 more strokes to choose the best 5 card hand out of the 7 cards available to each player:
%M=map{$_,$Z++}0..9,T,J,Q,K,A;sub N{/./;$M{$&}.$'}
sub A{my$I;
for$k(0..41){#d=#_;splice#d,$_,1for$k%7,$k/7;$s=#p=();
for$m(grep$_=N,#d){$m-$_||($s+=2,$p[$m]++)for#d}
#d=sort{$p[$b]-$p[$a]||$b-$a}#d;
$s=23 if$s<11&&($d[0]-$d[4]<5||$d[0]-$d[1]>8&&push#d,shift#d);
"#d"=~/.$/;$s+=14*(4<grep/$&/,#d);
$s=100*$s+$_ for#d;
$I=$s if$s>$I}$I}
++$X{A((#c=split)[0,1,4..8])<=>A(#c[2..8])}for<>;
printf"1: %d\n2: %d\nD: %d\n",#X{1,-1,0}
GolfScript - 151/187 chars
This program works on an input list of 10 cards per line, i.e. two 5 card hands.
n%0.#{3/5/{[zip~;.&,(!15*\[{n),*"TJQKA"+?}/]:|$),-4>=14*+1|{.2\?|#-,5\-.49?#*#+\.+#+\}/.16445=13*#+\]}%.~={0):0;;}{~>.!#+\#+\}if}/"1: "##n"2: "#n"D: "0
This program works on an input list of 9 cards per line, of the format described in the specifications.
n%0.#{3/.4>:D;2/2<{D+.{3/1$^.{3/1$^[zip~;.&,(!15*\[{n),*"TJQKA"+?}/]$:|),-4>=14*+1|{.2\?|#-,5\-.49?#*#+\.+#+\}/.16445=13*#+\]}%\;~}%$-1=\;}%.~={0):0;\(\}*~>.!#+\#+\}/"1: "##n"2: "#n"D: "0
Haskell: 793 796 806 826 864 904 901 880 863
Since the text file is inconsistent with 9 card hands, I'm just reading a line from the console and outputting who wins.
Bugfixes:
Ace now counts lower than a 2 in an ace-low run.
Comparing full houses fixed (again :D).
Guarantees that the best version of a given hand type is chosen. For example, if a player can choose between a 2-6 run and a 3-7 run, the 3-7 run is chosen (flushes aside).
Now shorter than the PHP solution!
Golfed:
import Data.List
(%)=mod
m=map
y=foldr1
t=0<1
z=13
w=[0,1,2,3,12]
n&x|length x<n=[]|t=take n x
b?x|b=x|t=[]
n!k= \c->e(n&m(%k)c)?(n&c)
e[]=1<1
e(x:y)=all(x==)y
k q c|any null[q c,p$c\\q c]=[]|t=q c
f=5!4
s c=(sort(m(%z)c)`elem`w:[[n..n+4]|n<-[0..8]])?c
r=3!z
p=2!z
g x y|c x y<2=x|t=y
q x(_,[])=x
q _ y=y
b h=y q$m($h)$zipWith(\t f->(,)t.y g.m(f.take 5).permutations)[1..][1!1,p,k p,r,s,f,k r,4!z,s.f]
h=reverse.a.m(%z)
a v|w\\v==[]=[-1..3]|t=sort v
c x y=o(h x)$h y
o[](_:_)=2
o[]_=0
o _[]=1
o(a:b)(k:d)|a>k=1|a<k=2|t=o b d
d n(a,k)|a==[]=0|n<1=0|r>s=1|r<s=2|f/=0=f|t=d(n-length o)(a\\o,k\\u)where(r,o)=b a;(s,u)=b k;f=c o u
i x=head.findIndices(x==)
u(n:k)c#[r,s]|n%z==i r"23456789TJQKA"&&n%4==i s"HDSC"=n|t=u k c
l c=(2&c++snd(splitAt 4c),drop 2c)
main=getLine>>=print.d 5.l.m(u[0..]).words
Ungolfed:
import Control.Exception (assert)
import Data.List (permutations, sort, intersect, findIndices, (\\))
import Data.Function (on)
(%) = mod
aceLowRun = [0,1,2,3,12]
tryTake n xs
| length xs < n = []
| otherwise = take n xs
cond ? xs
| cond = xs
| otherwise = []
eqOn n f cards = allEq (tryTake n $ map f cards) ? tryTake n cards
allEq [] = False
allEq (x:xs) = all (== x) xs
combWithPair pokerHand cards
| any null [picked1, picked2] = []
| otherwise = pokerHand cards
where
picked1 = pokerHand cards
picked2 = pair $ cards \\ picked1
straightFlush = straight . flush
quads = eqOn 4 (% 13)
fullHouse = combWithPair triples
flush = eqOn 5 (% 4)
straight cards = (sort (map (% 13) cards) `elem` runs) ? cards
where
runs = aceLowRun : [[n..n+4] | n <- [0..8]]
triples = eqOn 3 (% 13)
twoPair = combWithPair pair
pair = eqOn 2 (% 13)
single = eqOn 1 id
bestVersionOfHand [] ys = ys
bestVersionOfHand xs [] = xs
bestVersionOfHand xs ys
| compareSameRankedHands xs ys < 2 = xs
| otherwise = ys
rate rating pokerHand cards = (rating, handResult)
where
handResult = foldr1 bestVersionOfHand
(map (pokerHand . take 5) $ permutations cards)
pokerHands = zipWith rate [1..] [
single
, pair
, twoPair
, triples
, straight
, flush
, fullHouse
, quads
, straightFlush
]
bestHand hand = foldr1 (\xs ys -> if null (snd ys) then xs else ys)
(map ($ hand) pokerHands)
highestVals = reverse . arrangeVals . map (% 13)
where
arrangeVals vals = if vals `intersect` aceLowRun == aceLowRun
then [-1..3]
else sort vals
compareSameRankedHands = compareSameRankedHands' `on` highestVals
compareSameRankedHands' [] [] = 0
compareSameRankedHands' (card1:cards1) (card2:cards2)
| card1 > card2 = 1
| card1 < card2 = 2
| otherwise = compareSameRankedHands' cards1 cards2
decideWinner n cards1 cards2
| null cards1 = assert (null cards2) 0
| n < 1 = 0
| rating1 > rating2 = 1
| rating1 < rating2 = 2
| cmpRes /= 0 = cmpRes
| otherwise = decideWinner
(n - assert (length bests1 == length bests2) (length bests1))
(cards1 \\ bests1)
(cards2 \\ bests2)
where
(rating1, bests1) = bestHand cards1
(rating2, bests2) = bestHand cards2
cmpRes = compareSameRankedHands bests1 bests2
indexOf x = head . findIndices (x==)
toNum = toNum' [0..]
toNum' (n:ns) [rank, suit]
| n % 13 == indexOf rank "23456789TJQKA" && n % 4 == indexOf suit "HDSC" = n
| otherwise = toNum' ns [rank, suit]
cluster cards = (take 2 cards ++ snd (splitAt 4 cards), drop 2 cards)
main = getLine >>= print
. uncurry (decideWinner 5)
. cluster
. map toNum
. words
GolfScript 258 241 247/341 217/299 char
Solution for the 10 card problem. Only the last couple of newlines are significant:
10:T):J):Q):K):A;0:a;0:b;0:d;"\r\n"%{' '/5/{.{)\;}/4*-+++!:f;{);~}%{$0:z(%{.z-
!99*+:z}%}5*.{+}*99/:P!{..)\(#4+-!2*\;\.2<~9+-!\;+}and:s;[s f*6P=4P=f s P 6$]\;}
%.~={;;d):d;}{~>{a):a;}{b):b;}if}if}/
'1: 'a'
2: 'b'
D: 'd n
The 9 card problem currently needs about 80 more characters.
10:T):J):Q):K):A;0:a;0:b;0:d;"\r\n"%{' '/);{('Z'%+}2*[0$2>\7<]
{:H;7,{H=:x;H{x=!},:I;6,{I=:x;I{x=!},}/}%{.{)\;}/4*-+++!:f;
{);~}%{$0:z(%{.z-!99*+:z}%}5*.{+}*99/:P!{..)\(#4+-!2*\;\.2<~9+-!\;+}and:s;[
s f*6P=4P=f s P 6$]\;}%{[\].~>{~;}{~\;}if}*}%.~={;;d):d;}{~>{a):a;}{b):b;}if}if}/
'1: 'a'
2: 'b'
D: 'd n
Less golfed version of 10 card problem.
10:T;11:J;12:Q;13:K;14:A; # map for face cards
0:a;0:b;0:d; # other initialization
"\r\n"% # split input on \n
{ # on each line of input
' '/ # divide line into ten cards
5/ # split into five card hands
{. # on each of the two hands
{)\;}% # chop last character of each card
.(5*\;\{+}*= # check sum of elem == 5*1st elem
:f; # this is the flush flag
{);~}%$ # reduce cards to numerical values
0:z;{.z- 20%{}
{;z 20+}if:z}%{-1*}$ # detect pairs
.(:h;; # extract value of highest card
20h>{..)\(#4+-!2*\;\ # detect straight
.2<~9+-!\;+}and:s; # s=2 for regular straight, s=1 for A-5 straight
# result of this mapping - 6 elem array
[ 0$ # #6 - cards in the hand
.{20/}%{+}*:P # #5 - number of pairs
s # #4 - is this a straight?
f # #3 - is this a flush?
4P= # #2b - is this a full house?
h 59> # #2 - is this 4 of a kind?
s f * # #1 - is this a straight flush?
]-1%
\;
}/
\.#.# # put [hand1 hand2 hand1 hand2] on stack
= # check hand1==hand2
{;;d):d;} # if equal, increment d (draw)
{>{a):a;} # if >, increment a (player 1 wins)
{b):b;}if # if <, increment b (player 2 wins)
}if
}/
# output results
'1: 'a'
2: 'b'
D: 'd n
C, 665+379 chars
Here's my answer in 2 parts.
The first is a complete 7 card evaluator, including the "AddCard" macro A. It returns a 32-bit number ranking the hand. The high nibble is the type, bits 13..25 indicate the high card(s) and bits 0..12 indicate the kicker(s). When comparing the results, the better hand will always have the larger value.
#define U unsigned
#define c(a)X=a;i=C=0;while(X){C|=(X&1)<<i++;X/=4;}
#define A(h,c)h[c&7]+=c,h[3]|=c
U C,i,X;
U E(U h[]){
U a=h[0]+h[1]+h[2]+h[4]-(h[3]&-16),t,v,k,e=a&0x55555540,o=a&0xAAAAAA80;
if(v=e&o/2){t=7;k=h[3]^v;i=0;while(k/=4)i++;k=1<<2*i;}
else if(v=o&o-1){t=6;v/=2;k=o/2^v;}
else if(e>1&o>1){t=6;v=o/2;k=(i=e&e-1)?i:e;}
else{a=h[3];
if(t=h[i=1]-(a&1)&4||h[i=2]-(a&2)&8||h[i=4]-(a&4)&16||h[i=0]-(a&8)&32)a=h[i];
a&=-64;v=a|a>>26&16;t*=5;
if(v=v&v<<2&v<<4&v<<6&v<<8){t+=4;a=v&=~(v/2);}
else if(t)for(i=(h[i]&63)/(i?i:8),v=a;i-->5;)a&=a-1;
else if(v=o/2)t=3;
else if (e){o=e&e-1;v=(i=o&o-1)?o:e;t=1+(o>0);}
k=a^v;k&=k-1;k&=k-(i==0);}
c(v);v=C/8;c(k);
return t<<28|v<<13|C/8;}
The second is the input processor. It parses the project Euler file as 2+2+5 cards (ignoring the 10th card). It uses the Parse macro, P to create 32-bit values representing each card. The representation is 0A0K0Q0J0T090807060504030200shdc. A hand is stored as an array of 5 ints.
char*gets(char*);char*strchr(char*,char);
#define P(c)X=strchr(R,*c++)-R;C=1<<strchr(S,*c++)-S|64<<X*2;c++;
#define L(n)for(i=0;i<n;i++)
U g[5],h[5];
char*c,b[32];
char*S="CDHS";
char*R="23456789TJQKA";
int d,r[3]={0};
main(q){while(c=gets(b)){
L(2){P(c)A(g,C);}
L(2){P(c)A(h,C);}
L(5){P(c)A(g,C);A(h,C);}
d=E(g)-E(h);
r[d>0?0:d<0?1:2]++;
L(7)g[i]=h[i]=0;
}L(3)printf("%c:%d\n","12D"[i],r[i]);}
I'm sure there are a few more characters to be trimmed off. I'll add an explanation soon.
The evaluator runs #17.6 Million hands/second on my 3Ghz Core2 Duo. That's only 3.5x slower than the PokerSource evaluator, which uses at least 56K of lookup tables.
PHP, 799 chars
Line breaks are not significant. This takes input from the linked url, which is different from the example input (doesn't deal with community cards). Processing is similar to mobrule's perl answer, with a different scoring method.
<?php
function s($i){$o=array_map('intval',$i);$f=(count(array_unique(str_replace($o,'',$i)))==1);
sort($o);$v=array_count_values($o);arsort($v);$u=array_keys($v);$h=max($u);$m=$u[0];$c=reset($v);
$p=count($v);$e=$c==1&&$o[4]==14&&$o[3]==5;$r=$o==range($o[0],$o[0]+4)||$e;$q=$e?5:$h;
$s=($f&&$r&&($h==12)?2<<11:($f&&$r?(2<<10)+$q:0))+($c==4?(2<<9)+$m:0)+($c==3&&$p==2?(2<<8)+$m:0)+($f?(2<<7)+$h:0)+
($r?(2<<6)+$q:0)+($c==3?(2<<5)+$m:0)+($c==2&&$p==3?(2<<4)+$m:0)+($p==4?(2<<3)+$m:0);$s+=!$s?$h:0;return array($s,$u);}
foreach(file($argv[1]) as $d){
list($y,$z)=array_chunk(explode(' ',trim(strtr($d,array('T'=>10,'J'=>11,'Q'=>12,'K'=>13,'A'=>14)))),5);
$y=s($y);$z=s($z);$w=$y[0]-$z[0];$x=1;while(!$w&&$x<5){$w=$y[1][$x]-$z[1][$x++];}if(!$w)#$t++;elseif($w<0)#$l++;else #$k++;}
#print "1: $k\n2: $l\nD: $t";