Consider the following kernel, which reduces along the rows of a 2-D matrix
function row_sum!(x, ncol, out)
"""out = sum(x, dims=2)"""
row_idx = (blockIdx().x-1) * blockDim().x + threadIdx().x
for i = 1:ncol
#inbounds out[row_idx] += x[row_idx, i]
end
return
end
N = 1024
x = CUDA.rand(Float64, N, 2*N)
out = CUDA.zeros(Float64, N)
#cuda threads=256 blocks=4 row_sum!(x, size(x)[2], out)
isapprox(out, sum(x, dims=2)) # true
How do I write a similar kernel except for reducing along the columns (of a 2-D matrix)? In particular, how do I get the index of each column, similar to how we got the index of each row with row_idx?
Here is the code:
function col_sum!(x, nrow, out)
"""out = sum(x, dims=1)"""
col_idx = (blockIdx().x-1) * blockDim().x + threadIdx().x
for i = 1:nrow
#inbounds out[col_idx] += x[i, col_idx]
end
return
end
N = 1024
x = CUDA.rand(Float64, N, 2N)
out = CUDA.zeros(Float64, 2N)
#cuda threads=256 blocks=8 col_sum!(x, size(x, 1), out)
And here is the test:
julia> isapprox(out, vec(sum(x, dims=1)))
true
As you can see the size of the result vector is now 2N instead of N, hence we had to adapt the number of blocks accordingly (that is multiply by 2 and now we have 8 instead of 4)
More materials can be found here: https://juliagpu.gitlab.io/CUDA.jl/tutorials/introduction/
I have this function implemented in Cython:
def count_kmers_cython(str string, list alphabet, int kmin, int kmax):
"""
Count occurrence of kmers in a given string.
"""
counter = {}
cdef int i
cdef int j
cdef int N = len(string)
limits = range(kmin, kmax + 1)
for i in range(0, N - kmax + 1):
for j in limits:
kmer = string[i:i+j]
counter[kmer] = counter.get(kmer, 0) + 1
return counter
Can I do better with cython? Or Can I have any away to improve it?
I am new to cython, that is my first attempt.
I will use this to count kmers in DNA with alphabet restrict to 'ACGT'. The length of the general input string is the average bacterial genomes (130 kb to over 14 Mb, where each 1 kb = 1000 bp).
The size of the kmers will be 3 < kmer < 16.
I wish to know if I could go further and maybe use cython in this function to:
def compute_kmer_stats(kmer_list, counts, len_genome, max_e):
"""
This function computes the z_score to find under/over represented kmers
according to a cut off e-value.
Inputs:
kmer_list - a list of kmers
counts - a dictionary-type with k-mers as keys and counts as values.
len_genome - the total length of the sequence(s).
max_e - cut off e-values to report under/over represented kmers.
Outputs:
results - a list of lists as [k-mer, observed count, expected count, z-score, e-value]
"""
print(colored('Starting to compute the kmer statistics...\n',
'red',
attrs=['bold']))
results = []
# number of tests, used to convert p-value to e-value.
n = len(list(kmer_list))
for kmer in kmer_list:
k = len(kmer)
prefix, sufix, center = counts[kmer[:-1]], counts[kmer[1:]], counts[kmer[1:-1]]
# avoid zero division error
if center == 0:
expected = 0
else:
expected = (prefix * sufix) // center
observed = counts[kmer]
sigma = math.sqrt(expected * (1 - expected / (len_genome - k + 1)))
# avoid zero division error
if sigma == 0.0:
z_score = 0.0
else:
z_score = ((observed - expected) / sigma)
# pvalue for all kmers/palindromes under represented
p_value_under = (math.erfc(-z_score / math.sqrt(2)) / 2)
# pvalue for all kmers/palindromes over represented
p_value_over = (math.erfc(z_score / math.sqrt(2)) / 2)
# evalue for all kmers/palindromes under represented
e_value_under = (n * p_value_under)
# evalue for all kmers/palindromes over represented
e_value_over = (n * p_value_over)
if e_value_under <= max_e:
results.append([kmer, observed, expected, z_score, p_value_under, e_value_under])
elif e_value_over <= max_e:
results.append([kmer, observed, expected, z_score, p_value_over, e_value_over])
return results
OBS - Thank you CodeSurgeon by the help. I know there are other tools to count kmer efficiently but I am learning Python so I am trying to write my own functions and code.
You may not understand what I wrote clearly because English is not my first language.
Anyway, here is what I wrote.
public class Exercises7point11 {
public static void main(String[] args) {
java.util.Scanner input = new java.util.Scanner(System.in);
int[][] binaryNumber = {{0,0,0},{0,0,0},{0,0,0}};
System.out.print("Enter a number between 0 and 511: ");
int decimalNumber = input.nextInt();
int subtractNumber = 256, number = decimalNumber;
for (int row = 0 ; row < 3; row++){
for (int column = 0 ; column < 3; column++) {
if(number >= subtractNumber) {
binaryNumber[row][column] = 1;
number = number - subtractNumber;
}
else {
subtractNumber = subtractNumber / 2;
binaryNumber[row][column] = 0;
}
}
}
// print
for (int row = 0; row < binaryNumber.length; row++){
for (int column = 0; column < binaryNumber[row].length; column++){
if (binaryNumber[row][column] == 1)
System.out.print("T ");
else if (binaryNumber[row][column] == 0)
System.out.print("H ");
if (column == 2)
System.out.print("\n");
}
}
}
Here is the details. Nine coins are placed in a 3-by-3 matrix with some face up and some face down. You can represent the state of the coins using a 3-by-3 matrix with values 0 (heads) and 1 (tails).
Such as,
1 0 0
0 1 0
1 1 0.
There are a total of 512 possibilities, so I can use decimal numbers 0, 1, 2, 3,..., 511 to represent all states of the matrix. Write a program that prompts the user to enter a number between 0 and 511 and displays the corresponding matrix with the characters H and T.
My problem is "subtractNumber = subtractNumber / 2; binaryNumber[row][column] = 0;" in the 18 and 19 lines. Even though 'number is greater than or equal to subtractNumber, 18 and 19 lines are read.
I don't know how I can fix it.
Thank you so much!!
The output is a 3x3 matrix, but that doesn't mean you need to use a 2d array.
Here's a simpler approach. We're turning the user's input into a binary string using the toBinaryString method, then translating the binary string into H/T.
public static void main(String[] args) {
Scanner sc = new Scanner(System.in);
System.out.print("Enter a number between 0 and 511: ");
int input = sc.nextInt();
// Turn input to binary string
String binary = Integer.toBinaryString(input);
// Add enough zeros in front so that the string has 9 characters
binary = binary.format("%09d", Integer.parseInt(binary));
// Iterate through binary string one char at a time
for (int i = 1; i < 10; i++) {
if ('0' == binary.charAt(i - 1)) {
System.out.print("H ");
} else {
System.out.print("T ");
}
// New line after 3 letters
if (i % 3 == 0) {
System.out.println();
}
}
}
Example output
12 in binary is 000001100
Enter a number between 0 and 511: 12
H H H
H H T
T H H
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
What's more appropriate than a Spiral for Easter Code Golf sessions? Well, I guess almost anything.
The Challenge
The shortest code by character count to display a nice ASCII Spiral made of asterisks ('*').
Input is a single number, R, that will be the x-size of the Spiral. The other dimension (y) is always R-2. The program can assume R to be always odd and >= 5.
Some examples:
Input
7
Output
*******
* *
* *** *
* * *
***** *
Input
9
Output
*********
* *
* ***** *
* * * *
* *** * *
* * *
******* *
Input
11
Output
***********
* *
* ******* *
* * * *
* * *** * *
* * * * *
* ***** * *
* * *
********* *
Code count includes input/output (i.e., full program).
Any language is permitted.
My easily beatable 303 chars long Python example:
import sys;
d=int(sys.argv[1]);
a=[d*[' '] for i in range(d-2)];
r=[0,-1,0,1];
x=d-1;y=x-2;z=0;pz=d-2;v=2;
while d>2:
while v>0:
while pz>0:
a[y][x]='*';
pz-=1;
if pz>0:
x+=r[z];
y+=r[(z+1)%4];
z=(z+1)%4; pz=d; v-=1;
v=2;d-=2;pz=d;
for w in a:
print ''.join(w);
Now, enter the Spiral...
Python (2.6): 156 chars
r=input()
def p(r,s):x=(i+1)/2;print "* "*x+("*" if~i%2 else" ")*(r-4*x)+" *"*x+s
for i in range(r/2):p(r,"")
for i in range((r-1)/2-1)[::-1]:p(r-2," *")
Thanks for the comments. I've removed extraneous whitespace and used input(). I still prefer a program that takes its argument on the command-line, so here's a version still using sys.argv at 176 chars:
import sys
r=int(sys.argv[1])
def p(r,s):x=(i+1)/2;print "* "*x+("*" if~i%2 else" ")*(r-4*x)+" *"*x+s
for i in range(r/2):p(r,"")
for i in range((r-1)/2-1)[::-1]:p(r-2," *")
Explanation
Take the spiral and chop it in two almost-equal parts, top and bottom, with the top one row bigger than the bottom:
***********
* *
* ******* *
* * * *
* * *** * *
* * * * *
* ***** * *
* * *
********* *
Observe how the top part is nice and symmetrical. Observe how the bottom part has a vertical line down the right side, but is otherwise much like the top. Note the pattern in every second row at the top: an increasing number of stars on each side. Note that each intervening row is exactly the saw as the one before except it fills in the middle area with stars.
The function p(r,s) prints out the ith line of the top part of the spiral of width r and sticks the suffix s on the end. Note that i is a global variable, even though it might not be obvious! When i is even it fills the middle of the row with spaces, otherwise with stars. (The ~i%2 was a nasty way to get the effect of i%2==0, but is actually not necessary at all because I should have simply swapped the "*" and the " ".) We first draw the top rows of the spiral with increasing i, then we draw the bottom rows with decreasing i. We lower r by 2 and suffix " *" to get the column of stars on the right.
Java
328 characters
class S{
public static void main(String[]a){
int n=Integer.parseInt(a[0]),i=n*(n-2)/2-1,j=0,t=2,k;
char[]c=new char[n*n];
java.util.Arrays.fill(c,' ');
int[]d={1,n,-1,-n};
if(n/2%2==0){j=2;i+=1+n;}
c[i]='*';
while(t<n){
for(k=0;k<t;k++)c[i+=d[j]]='*';
j=(j+1)%4;
if(j%2==0)t+=2;
}
for(i=0;i<n-2;i++)System.out.println(new String(c,i*n,n));
}
}
As little as 1/6 more than Python seems not too bad ;)
Here's the same with proper indentation:
class S {
public static void main(String[] a) {
int n = Integer.parseInt(a[0]), i = n * (n - 2) / 2 - 1, j = 0, t = 2, k;
char[] c = new char[n * n];
java.util.Arrays.fill(c, ' ');
int[] d = { 1, n, -1, -n };
if (n / 2 % 2 == 0) {
j = 2;
i += 1 + n;
}
c[i] = '*';
while (t < n) {
for (k = 0; k < t; k++)
c[i += d[j]] = '*';
j = (j + 1) % 4;
if (j % 2 == 0)
t += 2;
}
for (i = 0; i < n - 2; i++)
System.out.println(new String(c, i * n, n));
}
}
F#, 267 chars
A lot of answers are starting with blanks and adding *s, but I think it may be easier to start with a starfield and add whitespace.
let n=int(System.Console.ReadLine())-2
let mutable x,y,d,A=n,n,[|1;0;-1;0|],
Array.init(n)(fun _->System.Text.StringBuilder(String.replicate(n+2)"*"))
for i=1 to n do for j=1 to(n-i+1)-i%2 do x<-x+d.[i%4];y<-y+d.[(i+1)%4];A.[y].[x]<-' '
Seq.iter(printfn"%O")A
For those looking for insight into how I golf, I happened to save a lot of progress along the way, which I present here with commentary. Not every program is quite right, but they're all honing in on a shorter solution.
First off, I looked for a pattern of how to paint the white:
*********
* *
* ***** *
* * * *
* *** * *
* * *
******* *
*********
*6543216*
*1*****5*
*2*212*4*
*3***1*3*
*41234*2*
*******1*
***********
* *
* ******* *
* * * *
* * *** * *
* * * * *
* ***** * *
* * *
********* *
***********
*876543218*
*1*******7*
*2*43214*6*
*3*1***3*5*
*4*212*2*4*
*5*****1*3*
*6123456*2*
*********1*
Ok, I see it. First program:
let Main() =
let n=int(System.Console.ReadLine())
let A=Array2D.create(n-2)n '*'
let mutable x,y,z,i=n-2,n-2,0,n-2
let d=[|0,-1;-1,0;0,1;1,0|] // TODO
while i>0 do
for j in 1..i-(if i%2=1 then 1 else 0)do
x<-x+fst d.[z]
y<-y+snd d.[z]
A.[y,x]<-'0'+char j
z<-(z+1)%4
i<-i-1
printfn"%A"A
Main()
I know that d, the tuple-array of (x,y)-diffs-modulo-4 can later be reduced by x and y both indexing into different portions of the same int-array, hence the TODO. The rest is straightforward based on the visual insight into 'whitespace painting'. I'm printing a 2D array, which is not right, need an array of strings, so:
let n=int(System.Console.ReadLine())
let s=String.replicate n "*"
let A=Array.init(n-2)(fun _->System.Text.StringBuilder(s))
let mutable x,y,z,i=n-2,n-2,0,n-2
let d=[|0,-1;-1,0;0,1;1,0|]
while i>0 do
for j in 1..i-(if i%2=1 then 1 else 0)do
x<-x+fst d.[z]
y<-y+snd d.[z]
A.[y].[x]<-' '
z<-(z+1)%4
i<-i-1
for i in 0..n-3 do
printfn"%O"A.[i]
Ok, now let's change the array of tuples into an array of int:
let n=int(System.Console.ReadLine())-2
let mutable x,y,z,i,d=n,n,0,n,[|0;-1;0;1;0|]
let A=Array.init(n)(fun _->System.Text.StringBuilder(String.replicate(n+2)"*"))
while i>0 do
for j in 1..i-i%2 do x<-x+d.[z];y<-y+d.[z+1];A.[y].[x]<-' '
z<-(z+1)%4;i<-i-1
A|>Seq.iter(printfn"%O")
The let for A can be part of the previous line. And z and i are mostly redundant, I can compute one in terms of the other.
let n=int(System.Console.ReadLine())-2
let mutable x,y,d,A=n,n,[|0;-1;0;1|],
Array.init(n)(fun _->System.Text.StringBuilder(String.replicate(n+2)"*"))
for i=n downto 1 do for j in 1..i-i%2 do x<-x+d.[(n-i)%4];y<-y+d.[(n-i+1)%4];A.[y].[x]<-' '
Seq.iter(printfn"%O")A
downto is long, re-do the math so I can go (up) to in the loop.
let n=int(System.Console.ReadLine())-2
let mutable x,y,d,A=n,n,[|1;0;-1;0|],
Array.init(n)(fun _->System.Text.StringBuilder(String.replicate(n+2)"*"))
for i=1 to n do for j in 1..(n-i+1)-i%2 do x<-x+d.[i%4];y<-y+d.[(i+1)%4];A.[y].[x]<-' '
Seq.iter(printfn"%O")A
A little more tightening yields the final solution.
Python : 238 - 221 - 209 characters
All comments welcome:
d=input();r=range
a=[[' ']*d for i in r(d-2)]
x=y=d/4*2
s=d%4-2
for e in r(3,d+1,2):
for j in r(y,y+s*e-s,s):a[x][j]='*';y+=s
for j in r(x,x+s*e-(e==d)-s,s):a[j][y]='*';x+=s
s=-s
for l in a:print''.join(l)
Groovy, 373 295 257 243 chars
Tried a recursive approach that builds up squares starting from the most extern one going inside.. I used Groovy.
*********
*********
*********
*********
*********
*********
******* *
*********
* *
* *
* *
* *
* * *
******* *
*********
* *
* ***** *
* ***** *
* *** * *
* * *
******* *
*********
* *
* ***** *
* * * *
* *** * *
* * *
******* *
and so on..
r=args[0] as int;o=r+1;c='*'
t=new StringBuffer('\n'*(r*r-r-2))
e(r,0)
def y(){c=c==' '?'*':' '}
def e(s,p){if (s==3)t[o*p+p..o*p+p+2]=c*s else{l=o*(p+s-3)+p+s-2;(p+0..<p+s-2).each{t[o*it+p..<o*it+p+s]=c*s};y();t[l..l]=c;e(s-2,p+1)}}
println t
readable one:
r=args[0] as int;o=r+1;c='*'
t=new StringBuffer('\n'*(r*r-r-2))
e(r,0)
def y(){c=c==' '?'*':' '}
def e(s,p){
if (s==3)
t[o*p+p..o*p+p+2]=c*s
else{
l=o*(p+s-3)+p+s-2
(p+0..<p+s-2).each{
t[o*it+p..<o*it+p+s]=c*s}
y()
t[l..l]=c
e(s-2,p+1)
}
}
println t
EDIT: improved by just filling squares and then overriding them (check new example): so I avoided to fill just the edge of the rect but the whole one.
Ruby, 237 chars
I'm new to code golf, so I'm way off the mark, but I figured I'd give it a shot.
x=ARGV[0].to_i
y=x-2
s,h,j,g=' ',x-1,y-1,Array.new(y){Array.new(x,'*')}
(1..x/2+2).step(2){|d|(d..y-d).each{|i|g[i][h-d]=s}
(d..h-d).each{|i|g[d][i]=s}
(d..j-d).each{|i|g[i][d]=s}
(d..h-d-2).each{|i|g[j-d][i]=s}}
g.each{|r|print r;puts}
Long version
Java, 265 250 245 240 chars
Rather than preallocating a rectangular buffer and filling it in, I just loop over x/y coordinates and output '*' or ' ' for the current position. For this, we need an algorithm which can evaluate arbitrary points for whether they're on the spiral. The algorithm I used is based on the observation that the spiral is equivalent to a collection of concentric squares, with the exception of a set of positions which all happen along a particular diagonal; these positions require a correction (they must be inverted).
The somewhat readable version:
public class Spr2 {
public static void main(String[] args) {
int n = Integer.parseInt(args[0]);
int cy = (n - 5) / 4 * 2 + 1;
int cx = cy + 2;
for (int y = n - 3; y >= 0; y--) {
for (int x = 0; x < n; x++) {
int dx = cx - x;
int dy = cy - y;
int adx = Math.abs(dx);
int ady = Math.abs(dy);
boolean c = (dx > 0 && dx == dy + 1);
boolean b = ((adx % 2 == 1 && ady <= adx) || (ady % 2 == 1 && adx <= ady)) ^ c;
System.out.print(b ? '*' : ' ');
}
System.out.println();
}
}
}
A brief explanation for the above:
cx,cy = center
dx,dy = delta from center
adx,ady = abs(delta from center)
c = correction factor (whether to invert)
b = the evaluation
Optimized down. 265 chars:
public class S{
public static void main(String[]a){
int n=Integer.parseInt(a[0]),c=(n-5)/4*2+1,d=c+2,e,f,g,h,x,y;
for(y=0;y<n-2;y++){
for(x=0;x<=n;x++){
e=d-x;f=c-y;g=e>0?e:-e;h=f>0?f:-f;
System.out.print(x==n?'\n':(g%2==1&&h<=g||h%2==1&&g<=h)^(e>0&&e==f+1)?'*':' ');
}}}}
Updated. Now down to 250 chars:
class S{
public static void main(String[]a){
int n=Integer.parseInt(a[0]),c=(n-5)/4*2+1,d=c+2,g,h,x,y;
for(y=-c;y<n-2-c;y++){
for(x=-d;x<=n-d;x++){
g=x>0?x:-x;h=y>0?y:-y;
System.out.print(x==n-d?'\n':(g%2==1&&h<=g||h%2==1&&g<=h)^(x<0&&x==y-1)?'*':' ');
}}}}
Shaved just a few more characters. 245 chars:
class S{
public static void main(String[]a){
int n=Integer.parseInt(a[0]),c=(n-5)/4*2+1,d=c+2,g,h,x,y=-c;
for(;y<n-2-c;y++){
for(x=-d;x<=n-d;x++){
g=x>0?x:-x;h=y>0?y:-y;
System.out.print(x==n-d?'\n':(g%2==1&h<=g|h%2==1&g<=h)^(x<0&x==y-1)?'*':' ');
}}}}
Shaved just a few more characters. 240 chars:
class S{
public static void main(String[]a){
int n=Byte.decode(a[0]),c=(n-5)/4*2+1,d=c+2,g,h,x,y=-c;
for(;y<n-2-c;y++){
for(x=-d;x<=n-d;x++){
g=x>0?x:-x;h=y>0?y:-y;
System.out.print(x==n-d?'\n':(g%2==1&h<=g|h%2==1&g<=h)^(x<0&x==y-1)?'*':' ');
}}}}
OCaml, 299 chars
Here is a solution in OCaml, not the shortest but I believe quite readable.
It only uses string manipulations using the fact the you can build a spiral by mirroring the previous one.
Let's say you start at with n = 5:
55555
5 5
555 5
Now with n = 7:
7777777
7 7
5 555 7
5 5 7
55555 7
Did you see where all the 5's went ?
Here is the unobfuscated code using only the limited library provided with OCaml:
(* The standard library lacks a function to reverse a string *)
let rev s =
let n = String.length s - 1 in
let r = String.create (n + 1) in
for i = 0 to n do
r.[i] <- s.[n - i]
done;
r
;;
let rec f n =
if n = 5 then
[
"*****";
"* *";
"*** *"
]
else
[
String.make n '*';
"*" ^ (String.make (n - 2) ' ') ^ "*"
] # (
List.rev_map (fun s -> (rev s) ^ " *") (f (n - 2))
)
;;
let p n =
List.iter print_endline (f n)
;;
let () = p (read_int ());;
Here is the obfuscated version which is 299 characters long:
open String
let rev s=
let n=length s-1 in
let r=create(n+1)in
for i=0 to n do r.[i]<-s.[n-i]done;r
let rec f n=
if n=5 then["*****";"* *";"*** *"]else
[make n '*';"*"^(make (n-2) ' ')^"*"]
#(List.rev_map(fun s->(rev s)^" *")(f(n-2)));;
List.iter print_endline (f(read_int ()))
C#, 292 262 255 chars
Simple approach: draw the spiral line by line from the outside in.
using C=System.Console;class P{static void Main(string[]a){int A=
1,d=1,X=int.Parse(a[0]),Y=X-2,l=X,t=0,i,z;while(l>2){d*=A=-A;l=l<
4?4:l;for(i=1;i<(A<0?l-2:l);i++){C.SetCursorPosition(X,Y);C.Write
("*");z=A<0?Y+=d:X+=d;}if(t++>1||l<5){l-=2;t=1;}}C.Read();}}
Ruby (1.9.2) — 126
f=->s{s<0?[]:(z=?**s;[" "*s]+(s<2?[]:[z]+f[s-4]<<?*.rjust(s))).map{|i|"* #{i} *"}<<z+"** *"}
s=gets.to_i;puts [?**s]+f[s-4]
Perl, where are you? )
If I have a integer number n, how can I find the next number k > n such that k = 2^i, with some i element of N by bitwise shifting or logic.
Example: If I have n = 123, how can I find k = 128, which is a power of two, and not 124 which is only divisible by two. This should be simple, but it eludes me.
For 32-bit integers, this is a simple and straightforward route:
unsigned int n;
n--;
n |= n >> 1; // Divide by 2^k for consecutive doublings of k up to 32,
n |= n >> 2; // and then or the results.
n |= n >> 4;
n |= n >> 8;
n |= n >> 16;
n++; // The result is a number of 1 bits equal to the number
// of bits in the original number, plus 1. That's the
// next highest power of 2.
Here's a more concrete example. Let's take the number 221, which is 11011101 in binary:
n--; // 1101 1101 --> 1101 1100
n |= n >> 1; // 1101 1100 | 0110 1110 = 1111 1110
n |= n >> 2; // 1111 1110 | 0011 1111 = 1111 1111
n |= n >> 4; // ...
n |= n >> 8;
n |= n >> 16; // 1111 1111 | 1111 1111 = 1111 1111
n++; // 1111 1111 --> 1 0000 0000
There's one bit in the ninth position, which represents 2^8, or 256, which is indeed the next largest power of 2. Each of the shifts overlaps all of the existing 1 bits in the number with some of the previously untouched zeroes, eventually producing a number of 1 bits equal to the number of bits in the original number. Adding one to that value produces a new power of 2.
Another example; we'll use 131, which is 10000011 in binary:
n--; // 1000 0011 --> 1000 0010
n |= n >> 1; // 1000 0010 | 0100 0001 = 1100 0011
n |= n >> 2; // 1100 0011 | 0011 0000 = 1111 0011
n |= n >> 4; // 1111 0011 | 0000 1111 = 1111 1111
n |= n >> 8; // ... (At this point all bits are 1, so further bitwise-or
n |= n >> 16; // operations produce no effect.)
n++; // 1111 1111 --> 1 0000 0000
And indeed, 256 is the next highest power of 2 from 131.
If the number of bits used to represent the integer is itself a power of 2, you can continue to extend this technique efficiently and indefinitely (for example, add a n >> 32 line for 64-bit integers).
There is actually a assembly solution for this (since the 80386 instruction set).
You can use the BSR (Bit Scan Reverse) instruction to scan for the most significant bit in your integer.
bsr scans the bits, starting at the
most significant bit, in the
doubleword operand or the second word.
If the bits are all zero, ZF is
cleared. Otherwise, ZF is set and the
bit index of the first set bit found,
while scanning in the reverse
direction, is loaded into the
destination register
(Extracted from: http://dlc.sun.com/pdf/802-1948/802-1948.pdf)
And than inc the result with 1.
so:
bsr ecx, eax //eax = number
jz #zero
mov eax, 2 // result set the second bit (instead of a inc ecx)
shl eax, ecx // and move it ecx times to the left
ret // result is in eax
#zero:
xor eax, eax
ret
In newer CPU's you can use the much faster lzcnt instruction (aka rep bsr). lzcnt does its job in a single cycle.
A more mathematical way, without loops:
public static int ByLogs(int n)
{
double y = Math.Floor(Math.Log(n, 2));
return (int)Math.Pow(2, y + 1);
}
Here's a logic answer:
function getK(int n)
{
int k = 1;
while (k < n)
k *= 2;
return k;
}
Here's John Feminella's answer implemented as a loop so it can handle Python's long integers:
def next_power_of_2(n):
"""
Return next power of 2 greater than or equal to n
"""
n -= 1 # greater than OR EQUAL TO n
shift = 1
while (n+1) & n: # n+1 is not a power of 2 yet
n |= n >> shift
shift <<= 1
return n + 1
It also returns faster if n is already a power of 2.
For Python >2.7, this is simpler and faster for most N:
def next_power_of_2(n):
"""
Return next power of 2 greater than or equal to n
"""
return 2**(n-1).bit_length()
This answer is based on constexpr to prevent any computing at runtime when the function parameter is passed as const
Greater than / Greater than or equal to
The following snippets are for the next number k > n such that k = 2^i
(n=123 => k=128, n=128 => k=256) as specified by OP.
If you want the smallest power of 2 greater than OR equal to n then just replace __builtin_clzll(n) by __builtin_clzll(n-1) in the following snippets.
C++11 using GCC or Clang (64 bits)
#include <cstdint> // uint64_t
constexpr uint64_t nextPowerOfTwo64 (uint64_t n)
{
return 1ULL << (sizeof(uint64_t) * 8 - __builtin_clzll(n));
}
Enhancement using CHAR_BIT as proposed by martinec
#include <cstdint>
constexpr uint64_t nextPowerOfTwo64 (uint64_t n)
{
return 1ULL << (sizeof(uint64_t) * CHAR_BIT - __builtin_clzll(n));
}
C++17 using GCC or Clang (from 8 to 128 bits)
#include <cstdint>
template <typename T>
constexpr T nextPowerOfTwo64 (T n)
{
T clz = 0;
if constexpr (sizeof(T) <= 32)
clz = __builtin_clzl(n); // unsigned long
else if (sizeof(T) <= 64)
clz = __builtin_clzll(n); // unsigned long long
else { // See https://stackoverflow.com/a/40528716
uint64_t hi = n >> 64;
uint64_t lo = (hi == 0) ? n : -1ULL;
clz = _lzcnt_u64(hi) + _lzcnt_u64(lo);
}
return T{1} << (CHAR_BIT * sizeof(T) - clz);
}
Other compilers
If you use a compiler other than GCC or Clang, please visit the Wikipedia page listing the Count Leading Zeroes bitwise functions:
Visual C++ 2005 => Replace __builtin_clzl() by _BitScanForward()
Visual C++ 2008 => Replace __builtin_clzl() by __lzcnt()
icc => Replace __builtin_clzl() by _bit_scan_forward
GHC (Haskell) => Replace __builtin_clzl() by countLeadingZeros()
Contribution welcome
Please propose improvements within the comments. Also propose alternative for the compiler you use, or your programming language...
See also similar answers
nulleight's answer
ydroneaud's answer
Here's a wild one that has no loops, but uses an intermediate float.
// compute k = nextpowerof2(n)
if (n > 1)
{
float f = (float) n;
unsigned int const t = 1U << ((*(unsigned int *)&f >> 23) - 0x7f);
k = t << (t < n);
}
else k = 1;
This, and many other bit-twiddling hacks, including the on submitted by John Feminella, can be found here.
assume x is not negative.
int pot = Integer.highestOneBit(x);
if (pot != x) {
pot *= 2;
}
If you use GCC, MinGW or Clang:
template <typename T>
T nextPow2(T in)
{
return (in & (T)(in - 1)) ? (1U << (sizeof(T) * 8 - __builtin_clz(in))) : in;
}
If you use Microsoft Visual C++, use function _BitScanForward() to replace __builtin_clz().
function Pow2Thing(int n)
{
x = 1;
while (n>0)
{
n/=2;
x*=2;
}
return x;
}
Bit-twiddling, you say?
long int pow_2_ceil(long int t) {
if (t == 0) return 1;
if (t != (t & -t)) {
do {
t -= t & -t;
} while (t != (t & -t));
t <<= 1;
}
return t;
}
Each loop strips the least-significant 1-bit directly. N.B. This only works where signed numbers are encoded in two's complement.
What about something like this:
int pot = 1;
for (int i = 0; i < 31; i++, pot <<= 1)
if (pot >= x)
break;
You just need to find the most significant bit and shift it left once. Here's a Python implementation. I think x86 has an instruction to get the MSB, but here I'm implementing it all in straight Python. Once you have the MSB it's easy.
>>> def msb(n):
... result = -1
... index = 0
... while n:
... bit = 1 << index
... if bit & n:
... result = index
... n &= ~bit
... index += 1
... return result
...
>>> def next_pow(n):
... return 1 << (msb(n) + 1)
...
>>> next_pow(1)
2
>>> next_pow(2)
4
>>> next_pow(3)
4
>>> next_pow(4)
8
>>> next_pow(123)
128
>>> next_pow(222)
256
>>>
Forget this! It uses loop !
unsigned int nextPowerOf2 ( unsigned int u)
{
unsigned int v = 0x80000000; // supposed 32-bit unsigned int
if (u < v) {
while (v > u) v = v >> 1;
}
return (v << 1); // return 0 if number is too big
}
private static int nextHighestPower(int number){
if((number & number-1)==0){
return number;
}
else{
int count=0;
while(number!=0){
number=number>>1;
count++;
}
return 1<<count;
}
}
// n is the number
int min = (n&-n);
int nextPowerOfTwo = n+min;
#define nextPowerOf2(x, n) (x + (n-1)) & ~(n-1)
or even
#define nextPowerOf2(x, n) x + (x & (n-1))