Xilinx VHDL latch warning troubleshooting - warnings

Xilinx is inferring a latch for some VHDL code I've written. I've looked up the possible causes and found that it's often due to incomplete if or case statements. I've gone through and made sure to include else and when others statements, but I'm still receiving the warning. I believe this is also affecting another project I'm working on, so I'd like to understand why this is the case.
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
entity state_machine is
port(trig, en: in std_logic; cstate,nstate: out std_logic_vector(0 to 2));
end state_machine;
architecture Behavioral of state_machine is
signal cstate_s,nstate_s: std_logic_vector(0 to 2);
begin
cstate <= cstate_s;
nstate <= nstate_s;
process(en, cstate_s)
begin
if en = '1' then
nstate_s <= "111";
if cstate_s = "111" then
nstate_s <= "011";
elsif cstate_s = "011" then
nstate_s <= "100";
elsif cstate_s = "100" then
nstate_s <= "101";
elsif cstate_s = "101" then
nstate_s <= "110";
elsif cstate_s = "110" then
nstate_s <= "111";
else
null;
end if;
else
null;
end if;
end process;
process(trig, nstate_s)
begin
if rising_edge(trig) then
cstate_s <= nstate_s;
else
null;
end if;
end process;
end Behavioral;
WARNING:Xst:737 - Found 3-bit latch for signal . Latches may
be generated from incomplete case or if statements. We do not
recommend the use of latches in FPGA/CPLD designs, as they may lead to
timing problems.

For a combinational process to synthesise without latches, there must be no path between begin and end process on which any output of the process is left unassigned. This is called complete assignment. An output of the process is any signal assigned anywhere within it.
You have such a path. The inner if is actually covered, because nstate_s <= "111" is assigned before it, so the inner else with null changes nothing. The outer else is the problem: when en is not '1', nstate_s is not assigned at all, so it has to hold its previous value, and that is a latch. A branch containing only a null statement does not count as an assignment. If you genuinely don't care what value the output takes on that path, assign it don't care, which is '-' in VHDL (for a vector, (others => '-')).
By the way (assuming trig is a clock), your second process is sequential rather than combinational, so it does not need to obey complete assignment; its else branch is unnecessary.
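As a minimal sketch of complete assignment applied to the code in the question: the default assignment at the top of the combinational process guarantees nstate_s is driven on every path. Holding the current state when en is '0' is an assumption about the intended behaviour, not something the question states; substitute whatever value you actually want there. The process labels are illustrative.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity state_machine is
  port (trig, en       : in  std_logic;
        cstate, nstate : out std_logic_vector(0 to 2));
end state_machine;

architecture Behavioral of state_machine is
  signal cstate_s, nstate_s : std_logic_vector(0 to 2);
begin
  cstate <= cstate_s;
  nstate <= nstate_s;

  -- Combinational next-state logic.
  next_state : process (en, cstate_s)
  begin
    -- Default assignment: every path through the process now drives nstate_s.
    -- Assumption: hold the current state while en = '0'.
    nstate_s <= cstate_s;
    if en = '1' then
      case cstate_s is
        when "111"  => nstate_s <= "011";
        when "011"  => nstate_s <= "100";
        when "100"  => nstate_s <= "101";
        when "101"  => nstate_s <= "110";
        when others => nstate_s <= "111";  -- covers "110" and any unused code
      end case;
    end if;
  end process;

  -- Sequential state register: only the clock belongs in the sensitivity list.
  state_reg : process (trig)
  begin
    if rising_edge(trig) then
      cstate_s <= nstate_s;
    end if;
  end process;
end Behavioral;
```

Because the register only changes on a clock edge, no else branch is needed in the second process.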

Related

FPGA output pins outputting wrong state

I am writing an LCD controller for an FPGA and am having a really weird (for me at least) problem. The state machine that's supposed to output the needed bits to the screen misbehaves and gets the output pins "stuck" in an old state, while it clearly has moved on to later states.
Here are the relevant parts of the state machine:
PROCESS (clk)
VARIABLE count: INTEGER RANGE 0 TO clk_divider; -- clk_divider is a generic positive.
BEGIN
IF (clk'EVENT AND clk = '1') THEN
count := count + 1;
IF (count = clk_divider) THEN
EAUX <= NOT EAUX;
count := 0;
END IF;
END IF;
END PROCESS;
....
PROCESS (EAUX)
BEGIN
IF (EAUX'EVENT AND EAUX = '1') THEN
pr_state <= nx_state;
END IF;
END PROCESS;
....
PROCESS (pr_state)
BEGIN
CASE pr_state IS
WHEN EntryMode => --6=1,7=Cursor increment/decrement, 8=Display shift on/off
RSs <='0';
DB(7 DOWNTO 0) := "00000110";
nx_state <= WriteData;
WHEN WriteData => --Write data to LCD:
RSs <='1';
YLED <= '1';
DB(7 DOWNTO 0) := "01011111";
i := i + 1;
IF (i < chars) THEN
nx_state <= WriteData;
ELSE
i := 0;
nx_state <= ReturnHome;
END IF;
WHEN ReturnHome => --Return cursor
RSs <='0';
YLED <= '1';
DB(7 DOWNTO 0) := "01011111";
nx_state <= WriteData;
END CASE;
END PROCESS;
The bits in the variable DB are assigned to the signal DBOUT:
DBOUT : OUT STD_LOGIC_VECTOR(7 DOWNTO 0) -- In entity
SHARED VARIABLE DB : STD_LOGIC_VECTOR(7 DOWNTO 0) := "00000000"; -- In Architecture
DBOUT <= DB;
DBOUT is mapped to pins (in the .ucf file) as:
NET "DBOUT(0)" LOC = P10;
NET "DBOUT(1)" LOC = P11;
NET "DBOUT(2)" LOC = P12;
NET "DBOUT(3)" LOC = P13;
NET "DBOUT(4)" LOC = P15;
NET "DBOUT(5)" LOC = P16;
NET "DBOUT(6)" LOC = P18;
NET "DBOUT(7)" LOC = P19;
Using an oscilloscope on the pins, I can see that it is clearly stuck outputting the "EntryMode" bits with "RSs" low, while YLED (the internal LED on the FPGA) is on (it's off in all other states). The really weird thing (and this took a really long time to find) is that if I change the EntryMode bits from
"00000110"
to
"00000100"
it successfully passes the state and outputs the correct bits. It might be true for other changes as well, but I don't really feel like testing that too much. Any help or tips would be highly appreciated!
UPDATE:
By popular request, I explicitly set YLED low in all the early states and switched DB back to being a signal. The result is that I can't reach the later states at all, or at least can't stay in them (even when fiddling with the magic bits, which I guess is a good thing), as YLED only stays on for a split second after booting the FPGA.
There is a complete example, including theory, state machine, and VHDL code on pages 279-290 of "Finite State Machines in Hardware: Theory and Design...", by Volnei Pedroni, MIT Press, Dec. 2013.

Mathematical operations within function argument

Is it possible to perform mathematical operations within the argument when calling a function?
For example:
answer = to_integer(dividend/divisor);
While Phillipe exaggerates the efficiency of the average VHDL coder, it's not a difficult thing to try.
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
entity foo is
end entity;
architecture fum of foo is
signal dividend: unsigned (7 downto 0) := ("11111111"); -- 255
signal divisor: unsigned (7 downto 0) := ("00001111"); -- 15
signal answer: integer;
begin
process
begin
answer <= to_integer(dividend/divisor);
wait for 0 ns;
report "answer = " & integer'image(answer);
wait;
end process;
end architecture;
The result:
foo.vhdl:17:9:#0ns:(report note): answer = 17
The wait for 0 ns; allows answer to assume the value of the operation. answer is a signal, and a signal assignment does not take effect until every process running in the current simulation cycle has suspended. A wait for 0 ns causes a delta-cycle delay, which is enough for the update to happen.
If answer were a variable declared in the process, its value would be available immediately and the wait wouldn't be necessary.
The last wait statement without a delay prevents the process from executing repeatedly.
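For comparison, here is a sketch of the same test with answer declared as a process variable. The entity name foo_var is made up for illustration. A variable assignment (:=) takes effect immediately, so the report can follow it directly with no delta-cycle wait.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity foo_var is
end entity;

architecture fum of foo_var is
    signal dividend: unsigned (7 downto 0) := "11111111";  -- 255
    signal divisor:  unsigned (7 downto 0) := "00001111";  -- 15
begin
    process
        variable answer: integer;  -- variable, not signal
    begin
        -- The expression is evaluated inside the function argument, as before.
        answer := to_integer(dividend/divisor);
        -- No "wait for 0 ns" needed: the variable already holds the new value.
        report "answer = " & integer'image(answer);
        wait;  -- suspend forever so the process runs only once
    end process;
end architecture;
```

With the same operands (255 and 15) this reports the same value as the signal version, 17.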

Getting warning error in vhdl code

I keep getting a strange warning from my code. It compiles fine, but I keep seeing:
Warning: Unconnected, internal signal \s(0)D\ is promoted to input PIN.
Warning: Unconnected, internal signal \s(1)D\ is promoted to input PIN.
The code is for a basic register which resets, shifts to the left and inserts S_IN, and loads the values set into the register using Pload. Can anyone help me figure out what is wrong with it?
library IEEE;
use ieee.std_logic_1164.all;
entity special_register is
port( DATA: in std_logic_vector(3 downto 0);
Reset: in std_logic;
PLoad: in std_logic;
S_Right: in std_logic;
S_IN: in std_logic;
clock : in std_logic;
S: in std_logic_vector(1 downto 0);
D: out std_logic_vector(3 downto 0);
Q : out std_logic_vector(3 downto 0));
end special_register;
architecture behav of special_register is
begin
process(clock, data, Reset, S_IN, S, S_Right, PLoad)
begin
if rising_edge(clock) then
S(0) <= (S_Right);
S(1) <= (PLoad);
if (S(1) = '1') then
D(3) <= DATA(3);
D(2) <= DATA(2);
D(1) <= DATA(1);
D(0) <= DATA(0);
else if (S(0) = '0') then
D(0) <= Q(1);
D(1) <= Q(2);
D(2) <= Q(3);
D(3) <= S_IN;
end if;
end if;
end if;
end process;
Q(3) <= (NOT Reset) AND D(3);
Q(2) <= (NOT Reset) AND D(2);
Q(1) <= (NOT Reset) AND D(1);
Q(0) <= (NOT Reset) AND D(0);
end behav;
In addition to Brian's answer you could note that this design specification is not VHDL compliant and the presumed synthesis tool isn't either:
ghdl -a special_register.vhdl
special_register.vhdl:21:2: port "s" can't be assigned
special_register.vhdl:22:2: port "s" can't be assigned
special_register.vhdl:29:14: port "q" cannot be read
special_register.vhdl:30:14: port "q" cannot be read
special_register.vhdl:31:14: port "q" cannot be read
special_register.vhdl:37:26: port "d" cannot be read
special_register.vhdl:38:26: port "d" cannot be read
special_register.vhdl:39:26: port "d" cannot be read
special_register.vhdl:40:26: port "d" cannot be read
ghdl: compilation error
(It would seem you are trying to read an output port as well).
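One common way to clear errors like these is to do all the work on internal signals and drive the out ports from them. The sketch below makes several assumptions not confirmed by the question: that S was meant to be internal state rather than an input port (so it is removed from the port list), that the nested "else if" was intended as elsif, and the internal names s_i, d_i and q_i, which are invented here. VHDL-2008 permits reading out-mode ports, but assigning to an in-mode port such as S is illegal in every revision.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;

entity special_register is
  port (DATA    : in  std_logic_vector(3 downto 0);
        Reset   : in  std_logic;
        PLoad   : in  std_logic;
        S_Right : in  std_logic;
        S_IN    : in  std_logic;
        clock   : in  std_logic;
        D       : out std_logic_vector(3 downto 0);
        Q       : out std_logic_vector(3 downto 0));
end special_register;

architecture behav of special_register is
  -- Internal copies (hypothetical names) that may be both read and assigned.
  signal s_i : std_logic_vector(1 downto 0);
  signal d_i : std_logic_vector(3 downto 0);
  signal q_i : std_logic_vector(3 downto 0);
begin
  process (clock)  -- a clocked process needs only the clock here
  begin
    if rising_edge(clock) then
      -- Registered control bits, as in the original; the tests below
      -- therefore see the values captured on the previous edge.
      s_i(0) <= S_Right;
      s_i(1) <= PLoad;
      if s_i(1) = '1' then
        d_i <= DATA;                     -- parallel load
      elsif s_i(0) = '0' then
        d_i <= S_IN & q_i(3 downto 1);   -- shift, inserting S_IN at bit 3
      end if;
    end if;
  end process;

  -- Same gating as Q(i) <= (NOT Reset) AND D(i) for Reset = '0' or '1'.
  q_i <= d_i when Reset = '0' else (others => '0');

  -- Ports are driven from the internal signals and never read.
  D <= d_i;
  Q <= q_i;
end behav;
```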

"Cannot use function in a procedure call" compiler error

Recursive Towers of Hanoi program in Ada.
So far I think I have most of it down; my problem is in my solve function.
I think I have the algorithm fine, but I am not sure how to implement it in the function. All the examples I see call the function inside itself, such as:
Example
My errors are:
hanoi.adb:23:09: cannot use function "solve" in a procedure call
hanoi.adb:27:09: cannot use function "solve" in a procedure call
hanoi.adb:59:15: missing ")"
Here is my code so far.
with ada.text_io, ada.command_line;
use ada.text_io, ada.command_line;
procedure hanoi is
Argument_Error : EXCEPTION;
max_disks, min_disks : integer := 3;
moves : integer := 0;
verbose_bool : boolean;
function solve (N: in integer; from, to, using: in character) return integer is
begin
if N = 1 then
if verbose_bool = true then
put("Move disk " & integer'image(N) & " from " & character'image(from) & " to " & character'image(to));
end if;
else
solve(N - 1, 'A', 'B', 'C');
if verbose_bool = true then
put("Move disk " & integer'image(N) & " from " & character'image(from) & " to " & character'image(to));
end if;
solve(N - 1, 'B', 'C', 'A');
end if;
moves := (2 ** min_disks) - 1;
return moves;
end solve;
begin
while min_disks /= max_disks loop
IF Argument_Count > 1 THEN
if Argument_Count = 1 then
min_disks := integer'value("Argument(1)");
elsif Argument_Count = 2 then
min_disks := integer'value("Argument(1)");
max_disks := integer'value("Argument(2)");
elsif Argument_Count = 3 then
min_disks := integer'value("Argument(1)");
max_disks := integer'value("Argument(2)");
if argument(3) = "v" or argument(3) = "V" then
verbose_bool := true; -- if argument is V or v it is true
end if;
END IF;
END IF;
IF Argument_Count > 3 THEN
RAISE argument_error;
END IF;
if (max_disks > 0) then
solve (N: integer; from, to, using : character);
END IF;
min_disks := min_disks + 1;
end loop;
EXCEPTION
WHEN Name_Error =>
Put_Line("Please re-enter your arguments, check to see if you entered integers and characters. Max of 3 arguments.");
WHEN OTHERS =>
Put_Line("Please try to not break the program again, thank you.");
end hanoi;
Functions return values, procedures do not, and you've defined Solve as a function.
Ada requires that you do something with a function's returned value, which you're not doing here. (You can't ignore the returned result as is done in other programming languages.)
As the error message states, your syntax is that of making a procedure call, i.e. invoking a procedure, but you've supplied the name of a function.
If the value being returned from a function is meaningful, then act on it in accordance with its purpose. If it is not providing any meaningful functionality, eliminate it and define Solve as a procedure.
As an aside, you may want to re-factor your display code into a nested subprogram. In the outline below, procedure Print can access the parameters of procedure Solve.
procedure Solve (N: in Integer; From, To, Using: in Character) is
procedure Print is
begin
if Verbose then
...
end if;
end Print;
begin
if N = 1 then
Print;
else
Solve (N - 1, 'A', 'B', 'C');
Print;
Solve (N - 1, 'B', 'C', 'A');
end if;
end Solve;
In addition to Marc's comment that the call to Solve is not a proper Ada function reference, the syntax you have is that of a specification and not that of an invocation of Solve. You had it right in Solve's body, just not in the initial invocation:
if (max_disks > 0) then
solve (N: integer; from, to, using : character);
END IF;

Can I parallelize my program?

This is my program:
program test
implicit none
integer n,m,k,i,j,Errorflag
real :: Yabs(39,39),angle(39,39)
real ,dimension(67,1) :: deltaA,A
real :: V(1,39),d(1,39),v1(29,1),d1(38,1),Ps(1,38),Qs(1,39),Jac(67,67),invJac(67,67)
real :: B1(1,38),B2(1,29),MF(1,67),trnsMF(67,1),P0(1,39),Q0(1,39)
real, dimension(38,38) :: dia1,offdia1,J1
real, dimension(29,29) :: dia2,dia3,dia4,offdia4,J4
real,dimension(38,29) ::offdia2,J2
real,dimension(29,38) ::offdia3,J3
real p,p1,q,q1
n=39;m=9
MF(1,1)=10
open(unit=3,file="ybus.dat",status="old")
open(unit=4,file="angle.dat",status="old")
do i=1,39
read(3,*) Yabs(i,1:39)
read(4,*)angle(i,1:39)
end do
close(3)
close(4)
open(unit=5,file="activepower.dat",status="old")
open(unit=8,file="reactivepower.dat",status="old")
read(5,*)Ps(1,1:38)
read(8,*)Qs(1,1:29)
close(5)
close(8)
do i=1,67
deltaA(i,1)=0
end do
v1(1:29,1)=1
d1(1:38,1)=0
A(1:38,1)=d1(1:38,1)
A(39:67,1)=v1(1:29,1)
!call cpu_time(t1)
do while(maxval(abs(MF))>0.0001)
V(1,1)=0.982
V(1,2:30)=v1(1:29,1)
V(1,31)=1.03
V(1,32)=0.9831
V(1,33)=1.0123
V(1,34)=0.9972
V(1,35)=1.0493
V(1,36)=1.0635
V(1,37)=1.0278
V(1,38)=1.0265
V(1,39)=1.0475
d(1,1)=0
d(1,2:39)=d1(1:38,1)
! % % % %------Active Power Calculation-----%
p1=0;p=0
do i=2,n
do j=1,n
p1=(V(i)*V(j)*Yabs(i,j)*cos(angle(i,j)-d(i)+d(j)))
p=p1+p
end do
P0(i-1)=p
p=0
end do
! % % % %------Reactive Power Calculation-----%
p=0;p1=0
do i=2,(n-m)
do j=1,n
p1=-(V(i)*V(j)*Yabs(i,j)*sin(angle(i,j)-d(i)+d(j)))
p=p1+p
end do
Q0(i-1)=p
p=0
end do
!!!!!!!!!!!mismatch factor
do i=1,(n-1)
B1(i)=Ps(i)-P0(i)
end do
do i=1,(n-m-1)
B2(i)=Qs(i)-Q0(i)
end do
MF(1,1:38)=B1(1,1:38)
MF(1,39:67)=B2(1,1:29)
!!!!!!!!jacobian calculation for preddictor step
!!!!!!!!!!!!!!!!!!!!!!dia of j1
p=0;p1=0
do i=2,n
do j=1,n
if(j .ne. i)then
p1=V(i)*V(j)*Yabs(i,j)*sin(angle(i,j)-d(i)+d(j))
!print*,p1
p=p1+p
end if
end do
i=i-1
dia1(i,i)=p
p=0
i=i+1
end do
!!!!!!!!!!!!!!off dia. of j1
q=0;q1=0;
do k=2,n
i=k
do j=2,n
if(j .ne. i)then
q1=V(i)*V(j)*Yabs(i,j)*sin(angle(i,j)-d(i)+d(j))
end if
i=i-1;j=j-1
offdia1(i,j)=-q1
q1=0
i=i+1;j=j+1
end do
end do
do i=1,38
do j=1,38
J1(i,j)=offdia1(i,j)+dia1(i,j)
end do
end do
!!!!!!!!!!!!!!!!!!!dia. of j2
p=0;p1=0
do i=2,(n-m)
do j=1,n
if(j .ne. i)then
p1=V(j)*Yabs(i,j)*cos(angle(i,j)-d(i)+d(j))
p=p1+p
end if
end do
dia2(i-1,i-1)=p+(2*V(i)*Yabs(i,i)*cos(angle(i,i)))
p=0;
end do
!!!!!!!!!!!!!!!!!!off dia. of j2
p1=0;
do k=2,n
i=k
do j=2,(n-m)
if(j .ne. i)then
p1=V(i)*Yabs(i,j)*cos(angle(i,j)-d(i)+d(j));
end if
i=i-1;j=j-1
offdia2(i,j)=p1
p1=0;
i=i+1;j=j+1
end do
end do
do i=1,(n-m-1)
offdia2(i,i)=dia2(i,i)
end do
J2=offdia2
!!!!!!!!!!!!!!!!!!!!dia. of j3
p=0;p1=0
do i=2,(n-m)
do j=1,n
if(j .ne. i)then
p1=V(i)*V(j)*Yabs(i,j)*cos(angle(i,j)-d(i)+d(j))
p=p1+p;
end if
end do
i=i-1;
dia3(i,i)=p
p=0;
i=i+1;
end do
!!!!!!!!!!!!!!off dia of j3
p=0;p1=0
do k=2,(n-m)
i=k;
do j=2,n
if(j .ne. i)then
p1=V(i)*V(j)*Yabs(i,j)*cos(angle(i,j)-d(i)+d(j))
end if
i=i-1;j=j-1
offdia3(i,j)=-p1;
p1=0;
i=i+1;j=j+1
end do
end do
do i=1,(n-m-1)
offdia3(i,i)=dia3(i,i)
end do
J3=offdia3
!!!!!!!!!!dia of j4
p=0;p1=0
do i=2,(n-m)
do j=1,n
if(j .ne. i)then
p1=V(j)*Yabs(i,j)*sin(angle(i,j)-d(i)+d(j))
p=p1+p
end if
end do
dia4(i-1,i-1)=-(2*V(i)*Yabs(i,i)*sin(angle(i,i)))-p
p=0;p1=0
end do
!!!!!!!!!!!!!!!off dia of j4
p1=0;p=0
do k=2,(n-m)
i=k;
do j=2,(n-m)
if(j .ne. i)then
p1=V(i)*Yabs(i,j)*sin(angle(i,j)-d(i)+d(j))
end if
i=i-1;j=j-1
offdia4(i,j)=-p1
p1=0;
i=i+1;j=j+1
end do
end do
do i=1,(n-m-1)
offdia4(i,i)=dia4(i,i);
end do
J4=offdia4
!!!!!!!
!!!!!!!!!!!!!!!!!!!formation of final jacobian!!!!!!!!!!
Jac( 1:38, 1:38) = J1 (1:38,1:38)
Jac( 1:38,39:67) = J2 (1:38,1:29)
Jac(39:67, 1:38) = J3 (1:29,1:38)
Jac(39:67,39:67) = J4 (1:29,1:29)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
!print*,Jac(23,21)
CALL FindInv(Jac,invJac ,67, ErrorFlag)
trnsMF=transpose(MF)
deltaA=matmul( invJac, trnsMF)
do i=1,67
A(i)=A(i)+deltaA(i)
end do
!!!!!!!!!!!!updating values
do i=1,(n-1)
d1(i)=A(i)
end do
k=0
do i=n,(2*n-2-m)
k=1+k
v1(k)=A(i)
end do
end do
end program test
The array "Ps" contains some values. Suppose I also want to solve the case where Ps(15) is increased to Ps(15)+1. Can I parallelize this code so that I get the answers for both values quickly?
I am using PGI compiler for CUDA FORTRAN.
Your code is fairly straightforward with lots of independent parallel loops. These parallel loops appear to be wrapped in an outer convergence do while loop, so as long as you keep the data on the device for all iterations of the convergence loop, you won't be bottlenecked by transfers.
I would recommend starting with compiler directives for this code rather than diving in to CUDA Fortran. Compiler directives work well for simple independent loops like these -- they are simple hints that you place in code comments that tell the compiler which loops to parallelize, which data to copy, etc.
You can first try OpenMP to accelerate to multiple CPU cores. Then you can use GPU directives such as OpenACC, which is going to be available soon in compilers from PGI, Cray, and CAPS. To get a head start, you could download a free trial of the PGI compiler and use their "Accelerator" directives. Accelerator is very similar in syntax to OpenACC.
Yes, you can use the PGI compiler to write CUDA kernels and make CUDA API calls.
PGI Fortran CUDA Homepage
The question, I think, you mean to ask is "Should I parallelize this code?"
My answer would be that yes you could see some mild benefits to parallelization, at a glance.
For example, segments like:
do i=2,n
do j=1,n
if(j .ne. i)then
p1=V(i)*V(j)*Yabs(i,j)*sin(angle(i,j)-d(i)+d(j))
!print*,p1
p=p1+p
end if
end do
i=i-1
dia1(i,i)=p
p=0
i=i+1
end do
These are an N^2 set of independent calculations (in this case you set n=39, but I assume it could change). Thus you're dealing with roughly 1,400 calculations in a loop nest like this one (38 values of i times 38 values of j). While ideally you'd want even more calculations to parallelize, you're at least in good shape in that many of your loops appear to be doing identical, independent work at each step, which is ideal for a threaded application.
Thus you could see some mild benefit from writing CUDA kernels to replace the looping code segments in your algorithm. Beware: the latency of memory transfers over the PCI bus does cancel out some of the performance gains, particularly for small systems.
Thus I would say, yes, by all means you can and should try this if you're game, but don't expect it to be 100x faster... maybe like 2-10x faster, if you code it well, depending on your loop bound size and level of divergence within the particular loops.
Worst case scenario you see no gains, or even see slowdown, but at least you've learned something!!