Why are my Verilog output registers only outputting "x"? - output

I've written the following module to take in a 12-bit stream of input, and run it through the Exponential Moving Average (EMA) formula. When simulating the program, it appears as if my output is a 12-bit stream of "don't cares". Regardless of what my input is, it isn't realized by the output register.
I'm a beginner at Verilog, so I'm sure there are a slew of issues here, but I expected to at least get some kind of output. Why is this behavior occurring and how can I fix it from occurring in the future? I've linked a screenshot of my waveform results at the bottom of this post.
My verilog module:
`timescale 1ns / 1ps
module EMATesting ( input signed [11:0] signal_in,
output reg signed [11:0] signal_out,
input clock_in,
input reset,
input enable,
output reg [11:0] newEMA);
reg [11:0] prevEMA;
reg [11:0] y;
reg [11:0] temp;
integer count = 0;
integer t = 1;
integer one = 1;
integer alpha = 0.5;
always #(posedge clock_in) begin
if (reset) begin
count = 0;
end
else if (count < 64) begin
count = count + 1;
end
//set the output equal to the first value received
if (count == 1) begin
newEMA = signal_in;
end
//if not the first value, run through the formula
else begin
temp = newEMA;
prevEMA = temp;
newEMA = alpha * signal_in + (one-alpha) * prevEMA;
count = count + 1;
end
end
endmodule
My verilog testbench:
`timescale 10ns / 1ns
module testbench_EMA;
reg signal_in;
reg clock_in, reset, enable;
wire [11:0] signal_out;
wire [11:0] newEMA;
initial begin
signal_in = 0; reset = 0; enable = 0;
clock_in = 0;
end
EMATesting DUT(signal_in, signal_out, clock_in, reset, enable, newEMA);
initial
begin
//test some values with timing intervals here
#100;
reset = 1;
#100;
reset = 0;
enable = 1;
#100;
//set signal_in to "1" repeating 12 times
signal_in = { 12 { 1'b1}};
#100;
reset = 1;
#100; //buffer to end simulation
$finish;
end
always
#5 clock_in = !clock_in;
endmodule
Waveform results of my testbench. Note that both output registers are giving no values.

There are many mistakes in your code. #Mr. Snrub listed some of them. You can correct the mistakes that are listed in the comments, and by further editing your code you can make it work as you wish, but there are more important fundamental problems. You have to solve them before you make further and hard to debug mistakes in the future, it also gives advantage of understanding HDL concepts better.
First of all, you need to understand the difference between software programming languages and hardware description languages. You should make further research for that, the mistakes you made are commonly made by developers who assume HDL is similar to software programming languages. Check this link for further information.
You need to further understand how the Verilog HDL (or any other HDL) codes are synthesized and why they are used.
Need to be familiar when and how to use blocking = or non-blocking <= assignments and how they work. Check this link.
When you research about the listed topics, you will understand that in edge sensitive (synchronous or sequential) always block, it is bad idea to use blocking = assignment.
When you use blocking assignment, the assignments are done in series, as in common software programming languages. In the case of edge sensitive always block, you have very limited time to finish all required assignments inside the always block.
In your case, as all the assignments inside the edge sensitive always block are blocking (assuming no reset condition):
First, it checks if count is less than 64, and completes the assignment if that is the case. All other assignments are waiting for that assignment to finish.
By the time the first assignment is done, the always block with posedge clock_in sensitivity might have ended execution and waiting for the next edge of the clock. So, other assignments are not done.
As you have not initialized your output registers, they initially were filled with 'don't cares' and new value is never assigned again, so, as a result, you got that output (filled with 'don't cares').
Even when count becomes larger than or equal to 64, you have other blocking assignments before you assign value to newEMA.

Related

Is using base case variable in a recursive function important?

I'm currently learning about recursion, it's pretty hard to understand. I found a very common example for it:
function factorial(N)
local Value
if N == 0 then
Value = 1
else
Value = N * factorial(N - 1)
end
return Value
end
print(factorial(3))
N == 0 is the base case. But when i changed it into N == 1, the result is still remains the same. (it will print 6).
Is using the base case important? (will it break or something?)
What's the difference between using N == 0 (base case) and N == 1?
That's just a coincidence, since 1 * 1 = 1, so it ends up working either way.
But consider the edge-case where N = 0, if you check for N == 1, then you'd go into the else branch and calculate 0 * factorial(-1), which would lead to an endless loop.
The same would happen in both cases if you just called factorial(-1) directly, which is why you should either check for > 0 instead (effectively treating every negative value as 0 and returning 1, or add another if condition and raise an error when N is negative.
EDIT: As pointed out in another answer, your implementation is not tail-recursive, meaning it accumulates memory for every recursive functioncall until it finishes or runs out of memory.
You can make the function tail-recursive, which allows Lua to treat it pretty much like a normal loop that could run as long as it takes to calculate its result:
local function factorial(n, acc)
acc = acc or 1
if n <= 0 then
return acc
else
return factorial(n-1, acc*n)
end
return Value
end
print(factorial(3))
Note though, that in the case of factorial, it would take you way longer to run out of stack memory than to overflow Luas number data type at around 21!, so making it tail-recursive is really just a matter of training yourself to write better code.
As the above answer and comments have pointed out, it is essential to have a base-case in a recursive function; otherwise, one ends up with an infinite loop.
Also, in the case of your factorial function, it is probably more efficient to use a helper function to perform the recursion, so as to take advantage of Lua's tail-call optimizations. Since Lua conveniently allows for local functions, you can define a helper within the scope of your factorial function.
Note that this example is not meant to handle the factorials of negative numbers.
-- Requires: n is an integer greater than or equal to 0.
-- Effects : returns the factorial of n.
function fact(n)
-- Local function that will actually perform the recursion.
local function fact_helper(n, i)
-- This is the base case.
if (i == 1) then
return n
end
-- Take advantage of tail calls.
return fact_helper(n * i, i - 1)
end
-- Check for edge cases, such as fact(0) and fact(1).
if ((n == 0) or (n == 1)) then
return 1
end
return fact_helper(n, n - 1)
end

Octave keeps giving results from function although not asked

I created a function in Octave for which I, at this moment, only want one of the possible outputs displayed. The code:
function [pi, time, numiter] = PageRank(pi0,H,v,n,alpha,epsilon);
rowsumvector=ones(1,n)*H';
nonzerorows=find(rowsumvector);
zerorows=setdiff(1:n,nonzerorows); l=length(zerorows);
a=sparse(zerorows,ones(l,1),ones(l,1),n,1);
k=0;
residual=1;
pi=pi0;
tic;
while (residual >= epsilon)
prevpi=pi;
k=k+1;
pi=alpha*pi*H + (alpha*(pi*a)+1-alpha)*v;
residual = norm(pi-prevpi,1);
end
pi;
numiter=k
time=toc;
endfunction
Now I only want numiter returned, but it keeps giving me back pi as well, no matter whether I delete pi;, or not.
It returns it in the following format:
>> PageRank(pi0,H,v,length(H),0.9,epsilon)
numiter = 32
ans =
0.026867 0.157753 0.026867 0.133573 0.315385
To me it seems strange that the pi is not given with its variable, but merely as an ans.
Any suggestions?
I know the Octave documentation for this is not very extensive, but perhaps it gives enough hints to understand that how you think about output variables is wrong.
The call
PageRank(pi0,H,v,length(H),0.9,epsilon)
returns a single output argument, it is equivalent to
ans = PageRank(pi0,H,v,length(H),0.9,epsilon)
ans is always the implied output argument if none is explicitly given. ans will be assigned the value of pi, the first output argument of your function. The variable pi (nor time, nor numiter) in your workspace will be modified or assigned to. These are the names of local variables inside your function.
To obtain other output variables, do this:
[out1,out2,out3] = PageRank(pi0,H,v,length(H),0.9,epsilon)
Now, the variable out1 will be assigned the value that pi had inside your function. out2 will contain the value of time, and out3 the value of numiter,
If you don't want the first two output arguments, and only want the third one, do this:
[~,~,out3] = PageRank(pi0,H,v,length(H),0.9,epsilon)
The ~ indicates to Octave that you want to ignore that output argument.

How to optimize finding values in 2D array in verilog

I need to set up a function that determines if a match exists in a 2D array (index).
My current implementation works, but is creating a large chain of LUTs due to if statements checking each element of the array.
function result_type index_search ( index_type index, logic[7:0] address );
for ( int i=0; i < 8; i++ ) begin
if ( index[i] == address ) begin
result = i;
end
end
Is there a way to check for matches in a more efficient manner?
Not much to be done, really; at least for the code in hand. Since your code targets hardware, to optimize it think in terms of hardware, not function/verilog code.
For a general purpose implementation, without any known data patterns, you'll definitely need (a) N W-bit equality checks, plus (b) an N:1 FPA (Fixed Priority Arbiter, aka priority encoder, aka leading zero detector) that returns the first match, assuming N W-bit inputs. Something like this:
Not much optimization to be done, but here are some possible general-purpose optimizations:
Pipelining, as shown in the figure, if timing is an issue.
Consider an alternative FPA implementation that makes use of 2's complement characteristics and may result to a more LUT-efficient implementation: assign fpa_out = fpa_in & ((~fpa_in)+1); (result in one-hot encoding, not weighted binary, as in your code)
Sticking to one-hot encoding can come in handy and reduce some of your logic down your path, but I cannot say for sure until we see some more code.
This is what the implementation would look like:
logic[N-1:0] addr_eq_idx;
logic[N-1:0] result;
for (genvar i=0; i<N; i++) begin: g_eq_N
// multiple matches may exist in this vector
assign addr_eq_idx[i] = (address == index[i]) ? 1'b1 : 1'b0;
// pipelined version:
// always_ff #(posedge clk, negedge arstn)
// if (!arstn)
// addr_eq_idx[i] <= 1'b0;
// else
// addr_eq_idx[i] <= (address == index[i]) ? 1'b1 : 1'b0;
end
// result has a '1' at the position where the first match is found
assign result = addr_eq_idx & ((~addr_eq_idx) + 1);
Finally, try to think if your design can be simplified due to known run-time data characteristics. For example, let's say you are 100% sure that the address you're looking for may exist within the index 2D array in at most one position. If that is the case, then you do not need an FPA at all, since the first match will be the only match. In that case, addr_eq_idx already points to the matching index, as a one-hot vector.

Composite trapezoid rule not running in Octave

I have the following code in Octave for implementing the composite trapezoid rule and for some reason the function only stalls whenever I execute it in Octave on f = #(x) x^2, a = 0, b = 4, TOL = 10^-6. Whenever I call trapezoid(f, a, b, TOL), nothing happens and I have to exit the Terminal in order to do anything else in Octave. Here is the code:
% INPUTS
%
% f : a function
% a : starting point
% b : endpoint
% TOL : tolerance
function root = trapezoid(f, a, b, TOL)
disp('test');
max_iterations = 10000;
disp(max_iterations);
count = 1;
disp(count);
initial = (b-a)*(f(b) + f(a))/2;
while count < max_iterations
disp(initial);
trap_0 = initial;
trap_1 = 0;
trap_1_midpoints = a:(0.5^count):b;
for i = 1:(length(trap_1_midpoints)-1)
trap_1 = trap_1 + (trap_1_midpoints(i+1) - trap_1_midpoints(i))*(f(trap_1_midpoints(i+1) + f(trap_1_midpoints(i))))/2;
endfor
if abs(trap_0 - trap_1) < TOL
root = trap_1;
return;
endif
intial = trap_1;
count = count + 1;
disp(count);
endwhile
disp(['Process ended after ' num2str(max_iterations), ' iterations.']);
I have tried your function in Matlab.
Your code is not stalling. It is rather that the size of trap_1_midpoints increases exponentionaly. With that the computation time of trap_1 increases also exponentionaly. This is what you experience as stalling.
I also found a possible bug in your code. I guess the line after the if clause should be initial = trap_1. Check the missing 'i'.
With that, your code still takes forever, but if you increase the tolerance (e.g. to a value of 1) your code return.
You could try to vectorize the for loop for speed up.
Edit: I think inside your for loop, a ) is missing after f(trap_1_midpoints(i+1).
After count=52 or so, the arithmetic sequence trap_1_midpoints is no longer representable in any meaningful fashion in floating point numbers. After count=1075 or similar, the step size is no longer representable as a positive floating point double number. That all is to say, the bound max_iterations = 10000 is ludicrous. As explained below, all computations after count=20 are meaningless.
The theoretical error for stepsize h is O(T·h^2). There is a numerical error accumulation in the summation of O(T/h) numbers that is of that size, i.e., O(mu/h) with mu=1ulp=2^(-52). Which in total means that the lowest error of the numerical integration can be expected around h=mu^(1/3), for double numbers thus h=1e-5 or in the algorithm count=17. This may vary with interval length and how smooth or wavy the function is.
One can expect the behavior that the error divides by four while halving the step size only for step sizes above this boundary 1e-5. This also means that abs(trap_0 - trap_1) is a reliable measure for the error of trap_0 (and abs(trap_0 - trap_1)/3 for trap_1) only inside this range of step sizes.
The error bound TOL=1e-6 should be met for about h=1e-3, which corresponds to count=10. If the recursion does not stop for count = 14 (which should give an error smaller than 1e-8) then the method is not accurately implemented.

Matlab Newbie Binary Search Troubleshoot

I am a newbie to Matlab/programming in general. I wish to write a program/script that uses recursive binary search to approximate the root of $2x - 3sin(x)+5=0$, such that the iteration terminates once the truncation error is definitely $< 0.5 \times 10 ^{-5}$ and print out the number of iterations as well as the estimate of the root.
Here is my attempt that seems to have broken my computer...
%Approximating the root of f(x) = 2*x - 3*sin(x) + 5 by binary search
%Define variables
low = input('Enter lower bound of range: ');
high = input('Enter upper bound of range: ');
mid = (low + high)/2;
%Define f_low & f_high
f_low = 2*low - 3*sin(low) + 5;
f_high = 2*high - 3*sin(high) + 5;
f_mid = 2*mid - 3*sin(mid) + 5;
%Check that the entered range contains the key
while (f_low * f_high) > 0 || low > high
disp('Invalid range')
low = input('Enter lower bound of range: ');
high = input('Enter upper bound of range: ');
end
%The new range
while abs(f_mid) > 0.5*10^(-5)
if f_mid < 0
low = mid;
elseif f_mid > 0
high = mid;
end
end
fprintf('mid = %.4f \n', mid)
I haven't even added in the number-of-iterations counting bit (which I am not quite sure how to do) and already I am stuck.
Thanks for any help.
Once you set high=mid or low=mid, is mid and f_mid recalculated? It looks like you will fail if f_low>0 and f_high<0. This is a valid condition, but you are choosing the wrong one to reset in this case. Also, your termination check is on the function value, not the difference between low and high. This may be what you want, or maybe you want to check both ways. For very flat functions you may not be able to get the function value that small.
You don't need f_mid, and is in fact misleading you. You just need to calculate the value at each step, and see which direction to go.
Plus, you are just changing low and high, but you do not evaluate again f_low or f_high. Matlab is not an algebra system (there are modules for symbolic computation, but that's a different story), so you did not define f_low and f_high to change with the change of low and high: you have to reevaluate them in your final loop.