SystemVerilog/Verilog: Is there a way to find the integer bit offset of a field of a packed struct?

I was wondering if there is a standard function in Verilog or SystemVerilog that will return the bit offset of a certain field within a packed struct. For instance, see the use of the hypothetical function $find_field_offset below:
typedef struct packed
{
logic [31:0] field_1;
logic [15:0] field_2;
logic [63:0] field_3;
} struct_type;
struct_type the_struct;
int field_1_offset;
assign field_1_offset = $find_field_offset(the_struct.field_1);
Thanks!

This may not be the most convenient approach, but here is native SystemVerilog code that finds the offset of field_1 inside struct_type:
function automatic int get_offset_struct_type_field_1;
struct_type x = '0;
x.field_1 = '1;
for (integer i = 0; i < $bits(x); i=i+1) begin
if (x[i] == 1) return i;
end
endfunction
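As a cross-check for the result of such a scan, the offsets can also be worked out by hand from the field widths. The Python sketch below is an illustration only. It assumes the packed-struct layout rule that the first declared member occupies the most significant bits, and the names FIELDS and field_offsets are made up for this example.

```python
# Field names and widths copied from struct_type in the question,
# listed in declaration order.
FIELDS = [("field_1", 32), ("field_2", 16), ("field_3", 64)]


def field_offsets(fields):
    """Return {name: index of the field's least significant bit}.

    Assumes the first declared member is the most significant one,
    so offsets are accumulated starting from the last member.
    """
    offsets = {}
    offset = 0
    for name, width in reversed(fields):
        offsets[name] = offset
        offset += width
    return offsets


if __name__ == "__main__":
    for name, off in field_offsets(FIELDS).items():
        print(name, off)
```

Under that assumption, the scanning function above should report the same number for field_1 as field_offsets(FIELDS)["field_1"].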

Thanks for the examples. Below are some pedantic checks and messages that confirm the offsets and sizes of all fields of some hierarchical structs. char_idx is the character position of a nibble in the struct when it is read as a string from a $writememh or similar. This task is invoked within an initial block to confirm that downstream parsing will be able to interpret the hexadecimal representation correctly.
task report_offsets(
);
phy_mon_info_s tr;
integer char_idx;
$display("tr is made of %0d bits",$bits(tr));
for (integer idx = 0;idx< $bits(tr);idx++) begin
char_idx = ($bits(tr) - idx - 1) >> 2;
tr = 1'b1 << idx;
if (|tr.inst > 0) $display("tr.inst claims bit %0d hex char %0d" ,idx, char_idx);
if (|tr.clock_count) $display("tr.clock_count claims bit %0d hex char %0d" ,idx, char_idx);
if (|tr.phy_info.dir) $display("tr.phy_info.dir claims bit %0d hex char %0d" ,idx, char_idx);
if (|tr.phy_info.data_type) $display("tr.phy_info.data_type claims bit %0d hex char %0d" ,idx, char_idx);
for (int inner = 0;inner< PHY_MON_FRAME_DWS;inner++) begin
if (|tr.phy_info.header[inner]) $display("tr.phy_info.header[%0d] claims bit %0d hex char %0d",inner,idx, char_idx);
end
if (|tr.phy_info.payload_dws) $display("tr.phy_info.payload_dws claims bit %0d hex char %0d",idx, char_idx);
if (|tr.phy_info.prim) $display("tr.phy_info.prim claims bit %0d hex char %0d" ,idx, char_idx);
if (|tr.phy_info.num_prims) $display("tr.phy_info.num_prims claims bit %0d hex char %0d" ,idx, char_idx);
if (|tr.phy_clock) $display("tr.phy_clock claims bit %0d hex char %0d" ,idx, char_idx);
end
assert($bits(tr.inst ) % 4 == 0) else $error("total bits in tr.inst %0d is not a multiple of 4!",$bits(tr.inst ));
assert($bits(tr.clock_count ) % 4 == 0) else $error("total bits in tr.clock_count %0d is not a multiple of 4!",$bits(tr.clock_count ));
assert($bits(tr.phy_info.dir ) % 4 == 0) else $error("total bits in tr.phy_info.dir %0d is not a multiple of 4!",$bits(tr.phy_info.dir ));
assert($bits(tr.phy_info.data_type ) % 4 == 0) else $error("total bits in tr.phy_info.data_type %0d is not a multiple of 4!",$bits(tr.phy_info.data_type ));
assert($bits(tr.phy_info.header ) % 4 == 0) else $error("total bits in tr.phy_info.header %0d is not a multiple of 4!",$bits(tr.phy_info.header ));
assert($bits(tr.phy_info.payload_dws ) % 4 == 0) else $error("total bits in tr.phy_info.payload_dws %0d is not a multiple of 4!",$bits(tr.phy_info.payload_dws ));
assert($bits(tr.phy_info.prim ) % 4 == 0) else $error("total bits in tr.phy_info.prim %0d is not a multiple of 4!",$bits(tr.phy_info.prim ));
assert($bits(tr.phy_info.num_prims ) % 4 == 0) else $error("total bits in tr.phy_info.num_prims %0d is not a multiple of 4!",$bits(tr.phy_info.num_prims ));
assert($bits(tr.phy_clock ) % 4 == 0) else $error("total bits in tr.phy_clock %0d is not a multiple of 4!",$bits(tr.phy_clock ));
assert($bits(tr ) % 4 == 0) else $error("total bits in tr %0d is not a multiple of 4!",$bits(tr ));
endtask
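The char_idx arithmetic in the task can be checked in isolation. This Python sketch assumes the value is dumped MSB first with one hex character per 4 bits. The helper name hex_char_index and the 112-bit width are illustrative choices, not part of the original code.

```python
def hex_char_index(total_bits, bit_idx):
    """Position (0 = leftmost) of the hex character that holds bit_idx.

    Mirrors `char_idx = ($bits(tr) - idx - 1) >> 2` from the task and
    assumes total_bits is a multiple of 4, as the assertions require.
    """
    return (total_bits - bit_idx - 1) >> 2


total = 112  # hypothetical struct width in bits
value = 1 << 5  # only bit 5 is set
text = format(value, "0{}x".format(total // 4))

# The only non-zero character sits where the formula says it should.
assert text[hex_char_index(total, 5)] == "2"
```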

I used @jclin's function and turned it into a macro, so you can call it like a function without replicating code.
`define get_offset(struct_name, field_name, bit_offset)\
struct_name = '0;\
struct_name.field_name = '1;\
for (integer i = 0; i < $bits(struct_name); i=i+1) begin\
if (struct_name[i] == 1) begin\
$display("%s offset=%4d", `"field_name`", i);\
bit_offset = i;\
break;\
end\
end
example_struct_t example_struct;
int return_value;
initial begin
`get_offset(example_struct, field_0, return_value);
`get_offset(example_struct, field_1, return_value);
`get_offset(example_struct, field_2.subfield_0, return_value);
end
The return value is useful if you want to collect the values and do some further calculation.

It doesn't really make sense to have a function that returns this in the form that you presented. The reason is that you don't have anything variable in there. You just pass in a value to the method and there is nothing the compiler could use to determine that you want to get the offset of field1 in struct_type.
The best you can do with native SystemVerilog is to define your own function that returns the offset based on an enumerated argument. Off the top of my head:
typedef enum { FIELD1, FIELD2, FIELD3 } struct_type_fields_e;
function int unsigned get_offset(struct_type_fields_e field);
case (field)
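FIELD1 : return 80; // field_1 is declared first, so it occupies the most significant bits [111:80]
FIELD2 : return 64; // field_2 occupies bits [79:64]
FIELD3 : return 0; // field_3 is declared last, so it starts at bit 0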
endcase
endfunction
You could probably do more with some VPI code, but you would need to change the way you call your function.

If you're ok with using the VPI, then you can do what you want with a little bit of type introspection. The problem with this is that you can't call the function in the way you like, because, to my knowledge, the compiler loses the context of what field1 actually is. What I mean by this is that the function would see a logic vector value, but not know it originated from a struct.
If you're okay with changing the function call to:
$find_field_offset(the_struct, "field1"); // note "field1" in quotes
then it would technically be possible to figure out that the_struct is of type struct_type and loop over all of its fields to find the field called "field1" and return the offset of that.
The problem with using VPI code is that support for the VPI object model varies from vendor to vendor. You have to be lucky enough to use a vendor that has support for the functions we would need here.


Read a binary array with mixed types

I am writing a function similar to this to read in binary formatted .ply files.
The linked function reads the header and then skips to the binary array, reading that in with numpy. I would like to do the same in Octave.
My code for reading the header is
fid = fopen('/path/to/file.ply');
tline = fgetl(fid); % read first line
len = 0;
prop = {};
dtype = {};
fmt = 'binary';
while ~strcmp(tline, "end_header")
len = len + length(tline) + 1; % increase header length, +1 includes EOL
tline = strsplit(tline); % split string
if strcmp('format', tline{1}) && strcmp('ascii', tline{2}) % test whether file is ascii
fmt = 'ascii';
end
if strcmp('element', tline{1}) && strcmp('vertex', tline{2}) % number of points
N = tline{3};
end
if strcmp('property', tline{1}) % identify fields
dtype = [dtype, tline{2}];
prop = [prop, tline{3}];
end
tline = fgetl(fid);
end
len = len + length(tline) + 1; % add 'end_header' to len
So I have arrays of data types
dtype =
{
[1,1] = float
[1,2] = float
[1,3] = float
[1,4] = int
[1,5] = int
[1,6] = int
[1,7] = float
[1,8] = float
[1,9] = float
}
and I know the shape of the array.
N = 61415
Is there a function that replicates numpy's fromfile, and can I seek to the right location in my file? (I know where the binary data starts in the file, as I have len.)
Following @tasos-papastylianou's answer I tried
fseek(fid, len);
fread(fid, 3, 'float')
Which returns the correct 3 values, but the next value is an integer and therefore gives the incorrect answer.
fread(fid, 4, 'float')
arr =
-1.4298e+00
-5.3943e+00
1.6623e+01
1.5274e-43 <<<< should be 109
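The odd fourth value is consistent with the four bytes of the integer 109 being reinterpreted as a 32-bit float. A quick way to convince yourself, sketched in Python with the standard struct module (little-endian byte order is an assumption here):

```python
import struct

# Pack 109 as a 4-byte integer, then read the same bytes back as a float.
raw = struct.pack("<i", 109)
as_float = struct.unpack("<f", raw)[0]
print(as_float)  # a tiny denormal, about 1.5274e-43

# Reading the bytes with the matching type gives the intended value.
as_int = struct.unpack("<i", raw)[0]
print(as_int)
```

So the bytes are being read from the right place; it is only the type passed to the read call that has to change from field to field.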
My solution
function pts = read_ply(fn)
fid = fopen(fn);
tline = fgetl(fid); % read first line
len = 0;
prop = {};
% dtype_map = {'float': 'f4', 'uchar': 'B', 'int':'i'}
dtype = {};
fmt = 'binary';
while ~strcmp(tline, "end_header")
len = len + length(tline) + 1; % increase header length, +1 includes EOL
tline = strsplit(tline); % split string
if strcmp('format', tline{1}) && strcmp('ascii', tline{2}) % test whether file is ascii
fmt = 'ascii';
elseif strcmp('element', tline{1}) && strcmp('vertex', tline{2}) % number of points
N = str2num(tline{3});
elseif strcmp('property', tline{1}) % identify fields
dtype = [dtype, tline{2}];
prop = [prop, tline{3}];
endif
tline = fgetl(fid);
endwhile
len = len + length(tline) + 1; % add 'end_header' to len
% total file length minus header
fseek(fid, 0, 1);
file_length = ftell(fid) - len;
types = struct('float', 4, 'int', 4, 'float64', 8);
pts = struct();
seek_plus = 0;
for i = 1:length(prop)
fseek(fid, len + seek_plus);
dt = types.(dtype{i}); % dtype for field
pts.(prop{i}) = fread(fid, N, dtype{i}, int32(file_length / N) - dt);
seek_plus = seek_plus + dt;
endfor
fclose(fid); % release the file handle once all fields are read
endfunction
This does not answer my original question as it involves a loop, but it seems fairly efficient. Arrays can be constructed like so:
xyz = [pts.x, pts.y, pts.z];
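For comparison, here is how the same interleaved layout could be decoded record by record in Python using only the standard struct module. It is a sketch under assumptions: the type map covers just the PLY type names used in the solution above, the data is taken to be little-endian, and read_records is a hypothetical helper name.

```python
import struct

# PLY type name -> struct format character (only the types used above).
PLY_TO_STRUCT = {"float": "f", "int": "i", "float64": "d"}


def read_records(buf, props, dtypes):
    """Decode interleaved binary records into one list per property."""
    fmt = "<" + "".join(PLY_TO_STRUCT[t] for t in dtypes)
    columns = {name: [] for name in props}
    for record in struct.iter_unpack(fmt, buf):
        for name, value in zip(props, record):
            columns[name].append(value)
    return columns


# Two made-up records with three floats followed by one int each.
props = ["x", "y", "z", "label"]
dtypes = ["float", "float", "float", "int"]
data = struct.pack("<fffi", 1.0, 2.0, 3.0, 109) + struct.pack("<fffi", 4.0, 5.0, 6.0, 7)
pts = read_records(data, props, dtypes)
```

Decoding whole records trades the per-field seek and skip for a single pass over the buffer, which is closer to what a structured read does.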
Your question confuses me a bit towards the end, since the most direct equivalent to numpy's saving an array in a numpy-specific binary format is octave's save which saves an array to an octave-specific binary format.
Having said that, this doesn't sound like what you want so I'm assuming the fromfile reference is a red herring.
In general if you have a binary file you want to open, read, or seek (i.e. place the cursor at a particular position), you can use the fopen, fread, and fseek commands. Also useful, ftell, frewind, etc.
These are all fairly simple commands. Just have a look at their documentation in the terminal (e.g. help fseek ).

Static const array in Cuda kernel

I need to have the following in the Cuda kernel:
static const float PREDEFINED_CONSTS[16] = {...}; // 16 constants.
float c = PREDEFINED_CONSTS[threadIdx.x % 16];
/// Use c in computations.
What's the best way to provide PREDEFINED_CONSTS ?
Const memory doesn't seem good, because different threads will access different locations.
If I define them as above, will PREDEFINED_CONSTS be stored in global memory?
What about this:
float c;
if ( threadIdx.x % 16 == 0 ) c = VAL0;
else if ( threadIdx.x % 16 == 1 ) c = VAL1;
...
else if ( threadIdx.x % 16 == 15 ) c = VAL15;
Although the last example has thread divergence, the literal VAL* values are part of the instruction opcode, so there will be no reading from memory.
What's the best way to provide PREDEFINED_CONSTS ?
If it were me, I would simply put what you have in your first example in your CUDA kernel and go with that. That is very likely the best way to do it. Later on, if you feel like you have a performance problem with your code, you can use a profiler to steer you in the direction of what needs to be addressed. I doubt it would be this. For constants, there really are only 2 possibilities:
Load them from some kind of memory
Load them as part of the instruction stream.
You've already indicated you are aware of this; you can simply benchmark both if you're really worried. Benchmarking would require more than what you have shown here, might be inconclusive, and may also depend on other factors such as how many times and in what way you are loading these constants.
As you have indicated already, __constant__ doesn't seem to be a sensible choice because the load pattern is clearly non-uniform, across the warp.
If I define them as above, will PREDEFINED_CONSTS be stored in global memory?
Yes, your first method will be stored in global memory. This can be confirmed with careful study and compilation using -Xptxas -v. Your second method has the potential (at least) to load the constants via the instruction stream. Since the second method is quite ugly from a coding perspective, and also very inflexible compared to the first method (what if I needed different constants per thread in different places in my code?), it's not what I would choose.
This strikes me as premature optimization. The first method is clearly preferred from a code flexibility and conciseness standpoint, and there is no real reason to think that simply because you are loading from memory that it is a problem. The second method is ugly, inflexible, and may not be any better from a performance perspective. Even if the data is part of the instruction stream, it still has to be loaded from memory.
Here's an example test case suggesting to me that the first case is preferred. If you come up with a different kind of test case, you may end up with a different observation:
$ cat t97.cu
#include <cstdio>
const float VAL0 = 1.1;
const float VAL1 = 2.2;
const float VAL2 = 3;
const float VAL3 = 4;
const float VAL4 = 5;
const float VAL5 = 6;
const float VAL6 = 7;
const float VAL7 = 8;
const float VAL8 = 9;
const float VAL9 = 10;
const float VAL10 = 11;
const float VAL11 = 12;
const float VAL12 = 13;
const float VAL13 = 14;
const float VAL14 = 15;
const float VAL15 = 16;
__global__ void k1(int l){
static const float PREDEFINED_CONSTS[16] = {VAL0, VAL1, VAL2, VAL3, VAL4, VAL5, VAL6, VAL7, VAL8, VAL9, VAL10, VAL11, VAL12, VAL13, VAL14, VAL15};
float sum = 0.0;
for (int i = 0; i < l; i++)
sum += PREDEFINED_CONSTS[(threadIdx.x+i) & 15];
if (sum == 0.0) printf("%f\n", sum);
}
__device__ float get_const(int i){
float c = VAL15;
unsigned t = (threadIdx.x+i) & 15;
if (t == 0) c = VAL0;
else if (t == 1) c = VAL1;
else if (t == 2) c = VAL2;
else if (t == 3) c = VAL3;
else if (t == 4) c = VAL4;
else if (t == 5) c = VAL5;
else if (t == 6) c = VAL6;
else if (t == 7) c = VAL7;
else if (t == 8) c = VAL8;
else if (t == 9) c = VAL9;
else if (t == 10) c = VAL10;
else if (t == 11) c = VAL11;
else if (t == 12) c = VAL12;
else if (t == 13) c = VAL13;
else if (t == 14) c = VAL14;
return c;
}
__global__ void k2(int l){
float sum = 0.0;
for (int i = 0; i < l; i++)
sum += get_const(i);
if (sum == 0.0) printf("%f\n", sum);
}
int main(){
int l = 1048576;
k1<<<1,16>>>(l);
k2<<<1,16>>>(l);
cudaDeviceSynchronize();
}
$ nvcc -o t97 t97.cu -Xptxas -v
ptxas info : 68 bytes gmem
ptxas info : Compiling entry function '_Z2k2i' for 'sm_52'
ptxas info : Function properties for _Z2k2i
8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 9 registers, 324 bytes cmem[0], 8 bytes cmem[2]
ptxas info : Compiling entry function '_Z2k1i' for 'sm_52'
ptxas info : Function properties for _Z2k1i
8 bytes stack frame, 0 bytes spill stores, 0 bytes spill loads
ptxas info : Used 32 registers, 324 bytes cmem[0]
$ nvprof ./t97
==22848== NVPROF is profiling process 22848, command: ./t97
==22848== Profiling application: ./t97
==22848== Profiling result:
Type Time(%) Time Calls Avg Min Max Name
GPU activities: 91.76% 239.39ms 1 239.39ms 239.39ms 239.39ms k2(int)
8.24% 21.508ms 1 21.508ms 21.508ms 21.508ms k1(int)
API calls: 62.34% 260.89ms 1 260.89ms 260.89ms 260.89ms cudaDeviceSynchronize
37.48% 156.85ms 2 78.427ms 10.319us 156.84ms cudaLaunchKernel
0.13% 542.39us 202 2.6850us 192ns 117.71us cuDeviceGetAttribute
0.04% 156.19us 2 78.094us 58.411us 97.777us cuDeviceTotalMem
0.01% 59.150us 2 29.575us 26.891us 32.259us cuDeviceGetName
0.00% 10.845us 2 5.4220us 1.7280us 9.1170us cuDeviceGetPCIBusId
0.00% 1.6860us 4 421ns 216ns 957ns cuDeviceGet
0.00% 1.5850us 3 528ns 283ns 904ns cuDeviceGetCount
0.00% 667ns 2 333ns 296ns 371ns cuDeviceGetUuid
$

Verilog conditional assign outputs X where there should be 1

I am currently building a sign extender in Verilog based on the one present in the ARMv8 processor, but after the first result is extended, every subsequent result makes a 1 in the output into an X. How do I get rid of the X?
The module and the quick test bench I made are shown below.
Sign Extender:
`timescale 1ns / 1ps
module SignExtender(BusImm, ImmIns);
output [63:0] BusImm;
input [31:0] ImmIns;
wire extBit;
assign extBit = (ImmIns[31:26] == 6'bx00101) ? ImmIns[25]:
(ImmIns[31:24] == 8'bxxx10100) ? ImmIns[23]:
(ImmIns[31:21] == 11'bxxxx1000xx0) ? ImmIns[20]:
1'b0;
assign BusImm = (ImmIns[31:26] == 6'bx00101) ? {{38{extBit}}, ImmIns[25:0]}:
(ImmIns[31:24] == 8'bxxx10100) ? {{45{extBit}}, ImmIns[23:5]}:
(ImmIns[31:21] == 11'bxxxx1000xx0) ? {{55{extBit}}, ImmIns[20:12]}:
64'b0;
assign BusImm = 64'b0;
endmodule
Test Bench:
`timescale 1ns / 1ps
`define STRLEN 32
`define HalfClockPeriod 60
`define ClockPeriod `HalfClockPeriod * 2
module SignExtenderTest;
task passTest;
input [63:0] actualOut, expectedOut;
input [`STRLEN*8:0] testType;
inout [7:0] passed;
if(actualOut == expectedOut) begin $display ("%s passed", testType); passed = passed + 1; end
else $display ("%s failed: 0x%x should be 0x%x", testType, actualOut, expectedOut);
endtask
task allPassed;
input [7:0] passed;
input [7:0] numTests;
if(passed == numTests) $display ("All tests passed");
else $display("Some tests failed: %d of %d passed", passed, numTests);
endtask
reg [7:0] passed;
reg [31:0] in;
wire [63:0] out;
SignExtender uut (
.BusImm(out),
.ImmIns(in)
);
initial begin
passed = 0;
in = 32'hF84003E9;
#10;
begin
passTest(out, 63'b0, "Stuff", passed);
#10;
in = 32'hf84093ea;
#10;
passTest(out, 63'b0, "Stuff", passed);
end
end
endmodule
You seem to be treating x as a "don't-care" value in your comparisons, but it is not. x is a specific value which represents "unknown". Since you drive your input signals to all known values (0 or 1), all your == comparisons resolve to x, and your output has x in it. You should only compare bits you are interested in. For example, change:
(ImmIns[31:21] == 11'bxxxx1000xx0) ? {{55{extBit}}, ImmIns[20:12]}:
to:
( (ImmIns[27:24] == 4'b1000) && (ImmIns[21] == 1'b0) ) ? {{55{extBit}}, ImmIns[20:12]}:
You need to make similar changes to all your comparisons.
Also, you drive BusImm with 2 continuous assignments. Get rid of this line:
assign BusImm = 64'b0;
These changes get the x out of your output.
Also consider using casez. Refer to IEEE Std 1800-2017, section 12.5.1 Case statement with do-not-cares.
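The idea of comparing only the bits you care about can be written as a mask-and-compare. The Python sketch below is illustrative only. matches is a hypothetical helper that takes the pattern as a string of 0, 1 and x characters, MSB first, and ignores every x position.

```python
def matches(value, pattern):
    """Return True if value agrees with pattern on every non-x bit."""
    mask = int("".join("0" if c == "x" else "1" for c in pattern), 2)
    want = int("".join("1" if c == "1" else "0" for c in pattern), 2)
    return (value & mask) == want


ins = 0xF84003E9  # first stimulus from the test bench
top11 = ins >> 21  # corresponds to ImmIns[31:21]
print(matches(top11, "xxxx1000xx0"))
```

This is the behaviour the original comparisons were hoping to get from ==, and it is roughly what casez provides when the don't-care bits are written as ?.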

How to print a number in Assembly 8086?

I'm trying to write a function that receives a number (which I pushed earlier), and prints it. How can I do it?
What I have so far:
org 100h
push 10
call print_num
print_num:
push bp
mov bp, sp
mov ax, [bp+2*2]
mov bx, cs
mov es, bx
mov dx, string
mov di, dx
stosw
mov ah, 09h
int 21h
pop bp
ret
string:
What you're placing at the address of string is a numerical value, not the string representation of that value.
The value 12 and the string "12" are two separate things. Seen as a 16-bit hexadecimal value, 12 would be 0x000C while "12" would be 0x3231 (0x32 == '2', 0x31 == '1').
You need to convert the numerical value into its string representation and then print the resulting string. Rather than just pasting a finished solution I'll show a simple way of how this could be done in C, which should be enough for you to base an 8086 implementation on:
char string[8], *stringptr;
short num = 123;
string[7] = '$'; // DOS string terminator
// The string will be filled up backwards
stringptr = string + 6;
while (stringptr >= string) {
*stringptr = '0' + (num % 10); // '3' on the first iteration, '2' on the second, etc
num /= 10; // 123 => 12 => 1 => 0
if (num == 0) break;
stringptr--;
}
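The same divide-by-ten loop, sketched in Python for readers who want to trace it before translating it to assembly. to_dos_string is a made-up name, and the sketch assumes a non-negative input, like the C version.

```python
def to_dos_string(num):
    """Convert a non-negative integer to decimal digits plus a '$' terminator."""
    digits = []
    while True:
        digits.append(chr(ord("0") + num % 10))  # least significant digit first
        num //= 10
        if num == 0:
            break
    # Digits were produced backwards, so reverse them before terminating.
    return "".join(reversed(digits)) + "$"


print(to_dos_string(123))
```

In the 8086 version, the remainder and quotient would both come from a single div by 10, and the pointer to the first digit is what gets passed to int 21h function 09h.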

How can I reverse the ON bits in a byte?

I was reading Joel's book where he was suggesting as interview question:
Write a program to reverse the "ON" bits in a given byte.
I only can think of a solution using C.
Asking here so you can show me how to do in a Non C way (if possible)
I claim trick question. :) Reversing all bits means a flip-flop, but only the bits that are on clearly means:
return 0;
What specifically does that question mean?
Good question. If reversing the "ON" bits means reversing only the bits that are "ON", then you will always get 0, no matter what the input is. If it means reversing all the bits, i.e. changing all 1s to 0s and all 0s to 1s, which is how I initially read it, then that's just a bitwise NOT, or complement. C-based languages have a complement operator, ~, that does this. For example:
unsigned char b = 102; /* 0x66, 01100110 */
unsigned char reverse = ~b; /* 0x99, 10011001 */
What specifically does that question mean?
Does reverse mean setting 1's to 0's and vice versa?
Or does it mean 00001100 --> 00110000 where you reverse their order in the byte? Or perhaps just reversing the part that is from the first 1 to the last 1? ie. 00110101 --> 00101011?
Assuming it means reversing the bit order in the whole byte, here's an x86 assembler version:
; al is input register
; bl is output register
xor bl, bl ; clear output
; first bit
rcl al, 1 ; rotate al through carry
rcr bl, 1 ; rotate carry into bl
; duplicate above 2-line statements 7 more times for the other bits
This is not the most efficient solution; a table lookup is faster.
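To see what the rotate pair does, and what the table lookup would contain, here is a small Python model. It is a sketch, not a cycle-accurate emulation: the carry flag is an ordinary variable, and reverse_byte and REVERSED are names chosen for this example.

```python
def reverse_byte(al):
    """Reverse the bit order of an 8-bit value, one bit per step."""
    bl = 0
    for _ in range(8):
        carry = (al >> 7) & 1          # rcl al, 1: top bit of al goes to carry
        al = (al << 1) & 0xFF          # the bit rotated into al is not needed
        bl = (carry << 7) | (bl >> 1)  # rcr bl, 1: carry enters the top of bl
    return bl


# Precomputed table for the lookup variant: one entry per byte value.
REVERSED = [reverse_byte(b) for b in range(256)]
```

With the table in place, reversing a byte is a single indexed read, which is why the lookup tends to win when the operation is done many times.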
Reversing the order of bits in C#:
byte ReverseByte(byte b)
{
byte r = 0;
for(int i=0; i<8; i++)
{
int mask = 1 << i;
int bit = (b & mask) >> i;
int reversedMask = bit << (7 - i);
r |= (byte)reversedMask;
}
return r;
}
I'm sure there are more clever ways of doing it but in that precise case, the interview question is meant to determine if you know bitwise operations so I guess this solution would work.
In an interview, the interviewer usually wants to know how you find a solution, what your problem-solving skills are, and whether the result is clean or a hack. So don't come up with too clever a solution, because that will probably mean you found it somewhere on the Internet beforehand. Don't pretend that you haven't seen it before and that you just came up with the answer because you are a genius, either; that will be even worse if she figures it out, since you are basically lying.
If you're talking about switching 1's to 0's and 0's to 1's, using Ruby:
n = 0b11001100
~n
If you mean reverse the order:
n = 0b11001100
eval("0b" + n.to_s(2).reverse)
If you mean counting the on bits, as mentioned by another user:
n = 123
count = 0
0.upto(7) { |i| count = count + n[i] }
♥ Ruby
I'm probably misremembering, but I thought that Joel's question was about counting the "on" bits rather than reversing them.
Here you go:
#include <stdio.h>
int countBits(unsigned char byte);
int main(){
FILE* out = fopen( "bitcount.c" ,"w");
int i;
fprintf(out, "#include <stdio.h>\n#include <stdlib.h>\n#include <time.h>\n\n");
fprintf(out, "int bitcount[256] = {");
for(i=0;i<256;i++){
fprintf(out, "%i", countBits((unsigned char)i));
if( i < 255 ) fprintf(out, ", ");
}
fprintf(out, "};\n\n");
fprintf(out, "int main(){\n");
fprintf(out, "srand ( time(NULL) );\n");
fprintf(out, "\tint num = rand() %% 256;\n");
fprintf(out, "\tprintf(\"The byte %%i has %%i bits set to ON.\\n\", num, bitcount[num]);\n");
fprintf(out, "\treturn 0;\n");
fprintf(out, "}\n");
fclose(out);
return 0;
}
int countBits(unsigned char byte){
unsigned char mask = 1;
int count = 0;
while(mask){
if( mask&byte ) count++;
mask <<= 1;
}
return count;
}
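The same lookup-table idea, sketched in Python for comparison. count_bits and BITCOUNT are illustrative names, and the table is built in memory rather than written out to a source file.

```python
def count_bits(byte):
    """Count the set bits of an 8-bit value by testing each bit position."""
    count = 0
    mask = 1
    while mask <= 0x80:
        if byte & mask:
            count += 1
        mask <<= 1
    return count


# Table indexed by byte value, equivalent to the generated bitcount[] array.
BITCOUNT = [count_bits(b) for b in range(256)]
```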
The classic Bit Hacks page has several (really very clever) ways to do this, but it's all in C. Any language derived from C syntax (notably Java) will likely have similar methods. I'm sure we'll get some Haskell versions in this thread ;)
byte ReverseByte(byte b)
{
return (byte)(b ^ 0xff); // the XOR is evaluated as int, so cast back to byte
}
That works if ^ is XOR in your language, but not if it's AND, which it often is.
And here's a version directly cut and pasted from OpenJDK, which is interesting because it involves no loop. On the other hand, unlike the Scheme version I posted, this version only works for 32-bit and 64-bit numbers. :-)
32-bit version:
public static int reverse(int i) {
// HD, Figure 7-1
i = (i & 0x55555555) << 1 | (i >>> 1) & 0x55555555;
i = (i & 0x33333333) << 2 | (i >>> 2) & 0x33333333;
i = (i & 0x0f0f0f0f) << 4 | (i >>> 4) & 0x0f0f0f0f;
i = (i << 24) | ((i & 0xff00) << 8) |
((i >>> 8) & 0xff00) | (i >>> 24);
return i;
}
64-bit version:
public static long reverse(long i) {
// HD, Figure 7-1
i = (i & 0x5555555555555555L) << 1 | (i >>> 1) & 0x5555555555555555L;
i = (i & 0x3333333333333333L) << 2 | (i >>> 2) & 0x3333333333333333L;
i = (i & 0x0f0f0f0f0f0f0f0fL) << 4 | (i >>> 4) & 0x0f0f0f0f0f0f0f0fL;
i = (i & 0x00ff00ff00ff00ffL) << 8 | (i >>> 8) & 0x00ff00ff00ff00ffL;
i = (i << 48) | ((i & 0xffff0000L) << 16) |
((i >>> 16) & 0xffff0000L) | (i >>> 48);
return i;
}
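The same swap-adjacent-groups technique scales down to a single byte with three steps: swap neighbouring bits, then neighbouring pairs, then the two nibbles. A Python sketch (reverse8 is a name chosen for this example, not part of OpenJDK):

```python
def reverse8(b):
    """Reverse the bit order of an 8-bit value without a loop."""
    b = ((b & 0x55) << 1) | ((b >> 1) & 0x55)  # swap adjacent bits
    b = ((b & 0x33) << 2) | ((b >> 2) & 0x33)  # swap adjacent 2-bit pairs
    b = ((b & 0x0F) << 4) | ((b >> 4) & 0x0F)  # swap the two nibbles
    return b
```

Each mask is applied before the left shift, so the intermediate values never grow beyond 8 bits.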
pseudo code..
while (Read())
Write(0);
I'm probably misremembering, but I thought that Joel's question was about counting the "on" bits rather than reversing them.
Here's the obligatory Haskell soln for complementing the bits, it uses the library function, complement:
import Data.Bits
import Data.Int
i = 123::Int
i32 = 123::Int32
i64 = 123::Int64
var2 = 123::Integer
test1 = sho i
test2 = sho i32
test3 = sho i64
test4 = sho var2 -- Exception
sho i = putStrLn $ showBits i ++ "\n" ++ (showBits $complement i)
showBits v = concatMap f (showBits2 v) where
f False = "0"
f True = "1"
showBits2 v = map (testBit v) [0..(bitSize v - 1)]
If the question means to flip all the bits, and you aren't allowed to use C-like operators such as XOR and NOT, then this will work:
bFlipped = -1 - bInput;
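The subtraction trick relies on two's complement, where -1 has every bit set, so subtracting a value from it clears exactly the bits that were set. For an unsigned byte the equivalent is 255 - b. A small Python check (flip_byte is a hypothetical name; the XOR appears only in the verification loop, not in the function):

```python
def flip_byte(b):
    """Complement an 8-bit value using subtraction only."""
    return 255 - b  # the 8-bit counterpart of -1 - b


# Verify against the usual bitwise complement for every byte value.
for b in range(256):
    assert flip_byte(b) == b ^ 0xFF
```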
I'd modify palmsey's second example, eliminating a bug and eliminating the eval:
n = 0b11001100
n.to_s(2).rjust(8, '0').reverse.to_i(2)
The rjust is important if the number to be bitwise-reversed is a fixed-length bit field -- without it, the reverse of 0b00101010 would be 0b10101 rather than the correct 0b01010100. (Obviously, the 8 should be replaced with the length in question.) I just got tripped up by this one.
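The padding issue is not specific to Ruby. A Python sketch of the same string-based reversal, where the zero-padded format width plays the role of rjust (reverse_bits_str is a made-up name and the default width of 8 is an assumption):

```python
def reverse_bits_str(n, width=8):
    """Reverse the bit order of n within a fixed-width field."""
    padded = format(n, "0{}b".format(width))  # keep the leading zeros
    return int(padded[::-1], 2)


print(format(reverse_bits_str(0b00101010), "08b"))
```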
Asking here so you can show me how to do in a Non C way (if possible)
Say you have the number 10101010. To change 1s to 0s (and vice versa) you just use XOR:
10101010
^11111111
--------
01010101
Doing it by hand is about as "Non C" as you'll get.
However from the wording of the question it really sounds like it's only turning off "ON" bits... In which case the answer is zero (as has already been mentioned) (unless of course the question is actually asking to swap the order of the bits).
Since the question asked for a non-C way, here's a Scheme implementation, cheerfully plagiarised from SLIB:
(define (bit-reverse k n)
(do ((m (if (negative? n) (lognot n) n) (arithmetic-shift m -1))
(k (+ -1 k) (+ -1 k))
(rvs 0 (logior (arithmetic-shift rvs 1) (logand 1 m))))
((negative? k) (if (negative? n) (lognot rvs) rvs))))
(define (reverse-bit-field n start end)
(define width (- end start))
(let ((mask (lognot (ash -1 width))))
(define zn (logand mask (arithmetic-shift n (- start))))
(logior (arithmetic-shift (bit-reverse width zn) start)
(logand (lognot (ash mask start)) n))))
Rewritten as C (for people unfamiliar with Scheme), it'd look something like this (with the understanding that in Scheme, numbers can be arbitrarily big):
int
bit_reverse(int k, int n)
{
int m = n < 0 ? ~n : n;
int rvs = 0;
while (--k >= 0) {
rvs = (rvs << 1) | (m & 1);
m >>= 1;
}
return n < 0 ? ~rvs : rvs;
}
int
reverse_bit_field(int n, int start, int end)
{
int width = end - start;
int mask = ~(-1 << width);
int zn = mask & (n >> start);
return (bit_reverse(width, zn) << start) | (~(mask << start) & n);
}
Reversing the bits.
For example, suppose we have a number represented by 01101011. If we reverse the bits, this number becomes 11010110. To achieve this, you should first know how to swap two bits in a number.
Swapping two bits in a number:-
XOR the two bits with each other and look at the result. If it is 0, the bits are the same and nothing needs to change; otherwise flip both bits (XOR each of them with 1) and store the result back in the original number.
Now for reversing the number
FOR I less than Numberofbits/2
swap(Number,I,NumberOfBits-1-I);
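The pseudocode can be fleshed out as follows. This is a Python sketch in which swap_bits and reverse_bits are illustrative names and the default field width of 8 bits is an assumption.

```python
def swap_bits(number, i, j):
    """Swap bit i and bit j of number."""
    # If the two bits differ, flipping both of them swaps them.
    if ((number >> i) ^ (number >> j)) & 1:
        number ^= (1 << i) | (1 << j)
    return number


def reverse_bits(number, num_bits=8):
    """Reverse the bit order by swapping mirrored positions."""
    for i in range(num_bits // 2):
        number = swap_bits(number, i, num_bits - 1 - i)
    return number


print(format(reverse_bits(0b01101011), "08b"))
```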