initial block delaying in verilog - mips

I am implementing a single-cycle MIPS processor in Verilog. The PC is initialized to address 0,
then updates its value to PC+1 at the posedge of the clock, which is also initialized to 0.
The problem is that in simulation the instruction at address 0 lasts only half a clock cycle; then the PC increments by 4 and the second instruction enters the processor.
simulation screenshot http://imagizer.imageshack.us/v2/800x600q90/36/0cxn.jpg
Neither initializing the clock to 1 nor adding delays before initializing the PC solved the problem.
This is my clock module:
`timescale 1ps / 1ps
module clk_gen( clk );
output reg clk ;
initial begin
clk<=0;
end
always begin
#1400 clk=!clk;
end
endmodule
PC module:
module PC(inPC, Address, clk);
input [31:0] inPC;
input clk;
output reg [31:0] Address;
initial begin
Address=32'd0;
end
always @(posedge clk) begin
Address <= inPC;
end
endmodule

The problem does not seem to be related to your clock module, but here are some comments based on that code.
Assuming you do not expect this to be synthesisable: ideally, a reg or logic value is assigned in only one process.
NB: if your simulator supports it, I would specify times with absolute units, i.e. #1.4ns for your delay.
initial begin
clk <= 1'b0;
forever begin
#1.4ns clk <= ~clk ;
end
end
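To see where the half cycle comes from with the clock module in the question, it can help to list the edge times. The following is only a rough Python sketch, not part of the original posts; the function name posedge_times is made up for illustration. It assumes the clock starts at 0 at time 0 and toggles every 1400 ps, as in clk_gen, and that the PC changes only on rising edges.

```python
def posedge_times(half_period_ps, initial_value, count):
    """Return the times (in ps) of the first `count` rising edges of a clock
    that starts at `initial_value` at time 0 and toggles every half period."""
    edges = []
    value = initial_value
    t = 0
    while len(edges) < count:
        t += half_period_ps
        value ^= 1              # toggle, like clk = !clk
        if value == 1:          # a 0 -> 1 transition is a rising edge
            edges.append(t)
    return edges

edges = posedge_times(1400, 0, 3)
print(edges)    # [1400, 4200, 7000]

# Time each PC value is held: address 0 is held from t=0 until the first edge.
hold = [edges[0]] + [b - a for a, b in zip(edges, edges[1:])]
print(hold)     # [1400, 2800, 2800]
```

Under these assumptions the PC holds address 0 for 1400 ps, half of the 2800 ps period, which matches the symptom described in the question.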

Memory allocation in Julia function

Here is a simple function in Julia 0.5.
function foo{T<:AbstractFloat}(x::T)
a = zero(T)
b = zero(T)
return x
end
I started Julia with julia --track-allocation=user, then ran include("test.jl"); test.jl contains only this function. I ran foo(5.), then Profile.clear_malloc_data(), then foo(5.) again in the REPL. I then quit Julia and looked at the file test.jl.mem:
- function foo{T<:AbstractFloat}(x::T)
- a = zero(T)
194973 b = zero(T)
0 return x
- end
-
Why are 194973 bytes of memory allocated here? This is also not the first line of the function, although after Profile.clear_malloc_data() that shouldn't matter.
Let's clarify some parts of the relevant documentation, which can be a little misleading:
In interpreting the results, there are a few important details. Under the user setting, the first line of any function directly called from the REPL will exhibit allocation due to events that happen in the REPL code itself.
Indeed, the line with allocation is not the first line. However, it is still the first tracked line, since Julia 0.5 has some issues with tracking allocation on the actual first statement (this has been fixed on v0.6). Note that it may also (contrary to what the documentation says) propagate into functions, even if they are annotated with @noinline. The only real solution is to ensure the first statement of what's being called is something you don't want to measure.
More significantly, JIT-compilation also adds to allocation counts, because much of Julia’s compiler is written in Julia (and compilation usually requires memory allocation). The recommended procedure is to force compilation by executing all the commands you want to analyze, then call Profile.clear_malloc_data() to reset all allocation counters. Finally, execute the desired commands and quit Julia to trigger the generation of the .mem files.
You're right that Profile.clear_malloc_data() prevents the allocation for JIT compilation being counted. However, this paragraph is separate from the first paragraph; clear_malloc_data does not do anything about allocation due to "events that happen in the REPL code itself".
Indeed, as I'm sure you suspected, there is no allocation in this function:
julia> function foo{T<:AbstractFloat}(x::T)
a = zero(T)
b = zero(T)
return x
end
foo (generic function with 1 method)
julia> @allocated foo(5.)
0
The numbers you see are due to events in the REPL itself. To avoid this issue, wrap the code to measure in a function. That is to say, we can use this as our test harness, perhaps after disabling inlining on foo with @noinline. For instance, here's a revised test.jl:
@noinline function foo{T<:AbstractFloat}(x::T)
a = zero(T)
b = zero(T)
return x
end
function do_measurements()
x = 0. # dummy statement
x += foo(5.)
x # prevent foo call being optimized out
# (it won't, but this is good practice)
end
Then a REPL session:
julia> include("test.jl")
do_measurements (generic function with 1 method)
julia> do_measurements()
5.0
julia> Profile.clear_malloc_data()
julia> do_measurements()
5.0
Which produces the expected result:
- @noinline function foo{T<:AbstractFloat}(x::T)
0 a = zero(T)
0 b = zero(T)
0 return x
- end
-
- function do_measurements()
155351 x = 0. # dummy statement
0 x += foo(5.)
0 x # prevent foo call being optimized out
- # (it won't, but this is good practice)
- end
-

FPGA output pins outputting wrong state

I am writing an LCD controller for an FPGA and am having a really weird (for me at least) problem. The state machine that's supposed to output the needed bits to the screen misbehaves and gets the output pins "stuck" in an old state, while it clearly has moved on to later states.
Here are the relevant parts of the state machine:
PROCESS (clk)
VARIABLE count: INTEGER RANGE 0 TO clk_divider; -- clk_divider is a generic positive.
BEGIN
IF (clk'EVENT AND clk = '1') THEN
count := count + 1;
IF (count = clk_divider) THEN
EAUX <= NOT EAUX;
count := 0;
END IF;
END IF;
END PROCESS;
....
PROCESS (EAUX)
BEGIN
IF (EAUX'EVENT AND EAUX = '1') THEN
pr_state <= nx_state;
END IF;
END PROCESS;
....
PROCESS (pr_state)
BEGIN
CASE pr_state IS
WHEN EntryMode => --6=1,7=Cursor increment/decrement, 8=Display shift on/off
RSs <='0';
DB(7 DOWNTO 0) := "00000110";
nx_state <= WriteData;
WHEN WriteData => --Write data to LCD:
RSs <='1';
YLED <= '1';
DB(7 DOWNTO 0) := "01011111";
i := i + 1;
IF (i < chars) THEN
nx_state <= WriteData;
ELSE
i := 0;
nx_state <= ReturnHome;
END IF;
WHEN ReturnHome => --Return cursor
RSs <='0';
YLED <= '1';
DB(7 DOWNTO 0) := "01011111";
nx_state <= WriteData;
END CASE;
END PROCESS;
The bits in the variable DB are assigned to the signal DBOUT:
DBOUT : OUT STD_LOGIC_VECTOR(7 DOWNTO 0) -- In entity
SHARED VARIABLE DB : STD_LOGIC_VECTOR(7 DOWNTO 0) := "00000000"; -- In Architecture
DBOUT <= DB;
DBOUT is mapped to pins (in the .ucf file) as:
NET "DBOUT(0)" LOC = P10;
NET "DBOUT(1)" LOC = P11;
NET "DBOUT(2)" LOC = P12;
NET "DBOUT(3)" LOC = P13;
NET "DBOUT(4)" LOC = P15;
NET "DBOUT(5)" LOC = P16;
NET "DBOUT(6)" LOC = P18;
NET "DBOUT(7)" LOC = P19;
Using an oscilloscope on the pins, I can see that the output is clearly stuck at the "EntryMode" bits with "RSs" low, while YLED (the internal LED on the FPGA) is on (it's off in all other states). The really weird thing (and this took a really long time to find) is that if I change the EntryMode bits from
"00000110"
to
"00000100"
it successfully passes the state and outputs the correct bits. It might be true for other changes as well, but I don't really feel like testing that too much. Any help or tips would be highly appreciated!
UPDATE:
By popular request I explicitly set YLED low in all the early states and switched DB (back) to being a signal. The result is that I can't reach the later states at all, or at least can't stay in them (even when fiddling with the magic bits, which I guess is a good thing), as YLED only stays on for a split second after booting the FPGA.
There is a complete example, including theory, state machine, and VHDL code on pages 279-290 of "Finite State Machines in Hardware: Theory and Design...", by Volnei Pedroni, MIT Press, Dec. 2013.

Mathematical operations within function argument

Is it possible to perform mathematical operations within the argument when calling a function?
For example:
answer = to_integer(dividend/divisor);
While Phillipe exaggerates the efficiency of the average VHDL coder, it's not a difficult thing to try.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity foo is
end entity;
architecture fum of foo is
signal dividend: unsigned (7 downto 0) := ("11111111"); -- 255
signal divisor: unsigned (7 downto 0) := ("00001111"); -- 15
signal answer: integer;
begin
process
begin
answer <= to_integer(dividend/divisor);
wait for 0 ns;
report "answer = " & integer'image(answer);
wait;
end process;
end architecture;
The result:
foo.vhdl:17:9:@0ns:(report note): answer = 17
The wait for 0 ns; allows answer to assume the value of the operation (it's a signal, and signal updates don't occur while any process is executing or has not yet suspended). Waiting for 0 ns causes a delta cycle delay.
If answer were a variable declared in the process, its value would be available immediately and the wait wouldn't be necessary.
The last wait statement without a delay prevents the process from executing repeatedly.
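The difference between the scheduled signal update and an immediate variable update can be mimicked with a toy model. This is only a Python sketch for illustration, not how a VHDL simulator is implemented; the Signal class and its method names are invented here, and the starting value 0 is an arbitrary placeholder rather than what a simulator would show for an uninitialized integer signal.

```python
class Signal:
    """Toy model of a VHDL signal: assignments are scheduled, not immediate."""
    def __init__(self, value):
        self.value = value
        self._pending = None

    def assign(self, new_value):
        # Like `sig <= expr;`: record the new value for the next delta cycle.
        self._pending = new_value

    def delta(self):
        # Like the process suspending (e.g. at `wait for 0 ns;`):
        # the scheduled value now becomes visible.
        if self._pending is not None:
            self.value = self._pending
            self._pending = None

dividend, divisor = 255, 15
answer = Signal(0)                    # 0 is a placeholder starting value
answer.assign(dividend // divisor)    # the operation sits inside the call argument
before = answer.value                 # still the old value
answer.delta()
after = answer.value                  # the quotient is now visible
print(before, after)                  # 0 17
```

Reading the value before the delta step returns the old value, which is why the report in the VHDL example is placed after the wait for 0 ns;.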

FFT implementation in Verilog: Assigning Wire input to Register type array

I am trying to implement the butterfly FFT algorithm in Verilog.
I create K (here 4) butterfly modules, like this:
localparam K = 4;
genvar i;
generate
for(i=0;i<N/2;i=i+1)
begin:BN
butterfly #(
.M_WDTH (3 + 2*1),
.X_WDTH (4)
)
bf (
.clk(clk),
.rst_n(rst_n),
.m_in(min),
.w(w[i]),
.xa(IN[i]),
.xb(IN[i+2]),
.x_nd(x_ndd),
.m_out(mout[i]),
.ya(OUT[i]),
.yb(OUT[i+2]),
.y_nd(y_nddd[i])
);
end
At each level I have to change the inputs xa and xb of each module (here the number of levels is 3).
So I tried to initialize a reg array "IN" and connect its entries to the inputs xa and xb. When I initialize the "IN" array manually, it works perfectly.
The problem I face now is that I can't assign the main module input X to the reg array "IN".
The main module input X is
input wire signed [N*2*X_WDTH-1:0] X,
and I have to assign this X to the array "IN":
reg signed [2*X_WDTH-1:0] IN [0:N-1];
I assigned it like this:
initial
begin
IN[0]= X[2*X_WDTH-1:0];
IN[1]=X[4*X_WDTH-1:2*X_WDTH];
IN[2]=X[6*X_WDTH-1:4*X_WDTH];
IN[3]= X[8*X_WDTH-1:6*X_WDTH];
IN[4]= X[10*X_WDTH-1:8*X_WDTH];
IN[5]=X[12*X_WDTH-1:10*X_WDTH];
IN[6]=X[14*X_WDTH-1:12*X_WDTH];
IN[7]= X[16*X_WDTH-1:14*X_WDTH];
end
I have gone through many tutorials and forums. No luck.
Can't a wire be assigned to a reg array? If not, how can I solve this problem?
Here is the main module where I instantiate the butterfly modules:
module Network
#(
// N
parameter N = 8,
// K.
parameter K = 3,
parameter M_WDTH=5,
parameter X_WDTH=4
)
(
input wire clk,
input wire rst_n,
// X
input wire signed [N*2*X_WDTH-1:0] X,
//Y
output wire signed [N*2*X_WDTH-1:0] Y,
output wire signed [K-1:0] y_ndd
);
wire y_nddd [K-1:0];
assign y_ndd ={y_nddd[1],y_nddd[0]};
reg [4:0] min=5'sb11111;
wire [4:0] mout [0:K-1];
reg x_ndd;
reg [2:0] count=3'b100;
reg [2*X_WDTH-1:0] w [K-1:0];
reg [2*X_WDTH-1:0] IN [0:N-1];
wire [2*X_WDTH-1:0] OUT [0:N-1];
assign Y = {OUT[3],OUT[2],OUT[1],OUT[0]};
reg [3:0] a;
initial
begin
//TODO : Here is the problem. Assigning Wire to reg array. Synthesize ok. In Simulate "red" output.
IN[0]= X[2*X_WDTH-1:0];
IN[1]=X[4*X_WDTH-1:2*X_WDTH];
IN[2]=X[6*X_WDTH-1:4*X_WDTH];
IN[3]= X[8*X_WDTH-1:6*X_WDTH];
IN[4]= X[10*X_WDTH-1:8*X_WDTH];
IN[5]=X[12*X_WDTH-1:10*X_WDTH];
IN[6]=X[14*X_WDTH-1:12*X_WDTH];
IN[7]= X[16*X_WDTH-1:14*X_WDTH];
//TODO :This is only a random values
w[0]=8'sb01000100;
w[1]=8'sb01000100;
w[2]=8'sb01000100;
w[3]=8'sb01000100;
end
/* levels */
genvar i;
generate
for(i=0;i<N/2;i=i+1)
begin:BN
butterfly #(
.M_WDTH (3 + 2*1),
.X_WDTH (4)
)
bf (
.clk(clk),
.rst_n(rst_n),
.m_in(min),
.w(w[i]),
.xa(IN[i]),
.xb(IN[i+N/2]),
.x_nd(x_ndd),
.m_out(mout[i]),
.ya(OUT[2*i]),
.yb(OUT[2*i+1]),
.y_nd(y_nddd[i])
);
end
endgenerate
always @ (posedge clk)
begin
if (count==3'b100)
begin
count=3'b001;
x_ndd=1;
end
else
begin
count=count+1;
x_ndd=0;
end
end
always @ (posedge y_ndd[0])
begin
//TODO
//Here I have to swap OUT-->IN
end
endmodule
Any help is appreciated.
Thanks in advance.
"Output is red" likely means it is x, which could be due to multiple drivers or an uninitialized value. If it were undriven it would be z.
The main issue, I believe, is that you do this:
initial begin
IN[0] = X[2*X_WDTH-1:0];
IN[1] = X[4*X_WDTH-1:2*X_WDTH];
...
The important part is the initial. This is only evaluated once, at time 0, and generally everything is x at time zero. To make this the equivalent of assign IN[0] = ... for a wire, use always @* begin. This is a combinatorial block which will update the values of IN whenever X changes.
always @* begin
IN[0] = X[2*X_WDTH-1:0];
IN[1] = X[4*X_WDTH-1:2*X_WDTH];
...
I am not sure why you do not just connect your X to your butterfly .xa and .xb ports directly though?
Other pointers
X is a bad variable name in Verilog, as a wire or reg can hold four values: 1, 0, x or z.
In always @(posedge clk) you should be using non-blocking (<=) assignments to correctly model the behaviour of a flip-flop.
y_ndd is K bits wide but only the first 2 bits are assigned.
output signed [K-1:0] y_ndd
assign y_ndd = {y_nddd[1],y_nddd[0]};
Assignments should be written in terms of the parameter width/size. For example, IN has N entries but currently exactly 8 entries are assigned, so there will be an issue when N!=8. Look into indexing vectors and arrays with +:. Example:
integer idx;
always @* begin
for (idx=0; idx<N; idx=idx+1)
IN[idx] = X[ idx*2*X_WDTH +: 2*X_WDTH];
end
genvar gidx;
generate
for(gidx=0; gidx<N; gidx=gidx+1) begin
assign Y[ gidx*2*X_WDTH +: 2*X_WDTH] = OUT[gidx];
end
endgenerate
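To check which bits land in which array entry with the +: indexing above, here is a small Python model of the same slicing. It is only an illustrative sketch: unpack_words and pack_words are names made up for this example, and the test value of X is arbitrary.

```python
def unpack_words(x, n, width):
    """Split the integer `x` into `n` fields of `width` bits each, mirroring
    IN[idx] = X[idx*width +: width] (field 0 is the least significant)."""
    mask = (1 << width) - 1
    return [(x >> (idx * width)) & mask for idx in range(n)]

def pack_words(words, width):
    """Inverse of unpack_words, mirroring Y[gidx*width +: width] = OUT[gidx]."""
    mask = (1 << width) - 1
    y = 0
    for idx, word in enumerate(words):
        y |= (word & mask) << (idx * width)
    return y

N, X_WDTH = 8, 4
WORD = 2 * X_WDTH                      # each entry is 2*X_WDTH = 8 bits wide
x = 0x0706050403020100                 # arbitrary test value: byte k holds k
words = unpack_words(x, N, WORD)
print(words)                           # [0, 1, 2, 3, 4, 5, 6, 7]
print(pack_words(words, WORD) == x)    # True
```

Because the loop bounds and offsets are computed from N and the word width, the same code works for other sizes, which is the point of writing the Verilog with +: instead of eight hand-written slices.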

MIPS data path for store word?

Based on this figure, executing the SW instruction would cause these values to be assigned to the signals labeled in blue:
RegWrite = 0
ALUSrc = 1
ALU operation = 0010
MemRead = 0
MemWrite = 1
MemtoReg = X
PCSrc =
However, I am a little confused which inputs will be used in the Registers block? Can anyone describe the overall SW procedure in the MIPS datapath?
The execution of sw would follow these steps in your diagram:
The instruction at the address held in the PC is read from the Instruction Memory subcircuit and decoded.
The register file is read for $rs and $rt (Registers subcircuit)
The value of $rs is added to the sign extended immediate (selected by ALUSrc) (ALU subcircuit).
The added value and $rt are passed to the Data Memory subcircuit where the value of $rt is written to memory.
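The steps above can be sketched as a small behavioural model. This is only a Python illustration, not the circuit in the figure; the function names, the dictionary used as data memory, and the example register values are assumptions made for the sketch. It uses the standard I-type field layout (rs in bits 25-21, rt in bits 20-16, immediate in bits 15-0).

```python
def sign_extend16(imm):
    """Sign-extend a 16-bit immediate to a Python int."""
    imm &= 0xFFFF
    return imm - 0x10000 if imm & 0x8000 else imm

def execute_sw(instr, regs, data_mem):
    """Model `sw $rt, imm($rs)`: store regs[rt] at address regs[rs] + imm."""
    rs = (instr >> 21) & 0x1F                  # read register 1 (base address)
    rt = (instr >> 16) & 0x1F                  # read register 2 (value to store)
    imm = sign_extend16(instr)                 # low 16 bits, sign-extended
    address = (regs[rs] + imm) & 0xFFFFFFFF    # ALU add, ALUSrc = 1
    data_mem[address] = regs[rt]               # MemWrite = 1; RegWrite = 0
    return address

# sw $t0, 4($sp): opcode 0x2B, rs = 29 ($sp), rt = 8 ($t0), imm = 4
instr = (0x2B << 26) | (29 << 21) | (8 << 16) | 4
regs = [0] * 32
regs[29] = 0x1000          # example base address
regs[8] = 0xDEADBEEF       # example value to store
mem = {}
addr = execute_sw(instr, regs, mem)
print(hex(addr), hex(mem[addr]))    # 0x1004 0xdeadbeef
```

In this model both register read ports are used ($rs for the address, $rt for the data) and the write port is not used at all, which is why RegWrite is 0 and MemtoReg is a don't-care for sw.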