How to not synthesize memory for release in Chisel - chisel

I'm writing an architecture that makes extensive internal use of SyncReadMem, which is intended to represent SRAM memory banks. Our current synthesis toolchain, however, does not support SRAM properly, but is fine for registers and computational logic. So, whenever we're running a synthesis build, I pass in a flag that disables the elaboration of any SyncReadMem modules and treats them purely as IO signals:
class exampleModule (synthesis: Boolean) extends Module {
val io = IO(new Bundle {
val in = Input(UInt(32.W))
val out = Output(Uint(32.W))
val synth_bundle = if (synthesis) Some(new Bundle() {
val synth_out = Output(UInt(32.W))
val synth_in = Input(UInt(4.W))
}) else None
}
val mem_read = if (synthesis) {
val memory = SyncReadMem(1 << 32, UInt(4.W))
memory.read(io.in)
} else {
io.synth_bundle.get.synth_out := io.in
io.synth_bundle.get.synth_in
}
io.out := mem_read * 2.U
}
This, to my understanding, should properly elaborate all of my logic (ie not optimize anything out that I want to be there), and won't elaborate any of the memory whenever I have synthesis builds enabled.
The problem I'm running into is that for really hierarchical modules where a deeply ingrained module needs something like this, it requires every module above it to implement this synth_bundle style IO, requiring a bit of writing and adding no functionality. Is there some easier / more canonical way to do this?
Thank you

Related

Chisel3 REPL Vec assignment into module only works after eval

If we run the following Chisel3 code
class Controller extends Module {
val io = IO(new Bundle {
})
val sff = Module(new SFF)
val frame: Vec[UInt] = Reg(Vec(ProcedureSpaceSize, Integer32Bit))
for(i <- 0 until ProcedureSpaceSize)
frame(i) := 99.U
sff.io.inputDataVector := frame
}
class SFF extends Module {
val io = IO(new Bundle {
val inputDataVector: Vec[UInt] = Input(Vec(ProcedureSpaceSize, Integer32Bit))
})
}
in REPL debug mode. First do
reset;step
peek sff.io_inputDataVector_0;peek sff.io_inputDataVector_1;peek sff.io_inputDataVector_2
The REPL returns
Error: exception Error: getValue(sff.io_inputDataVector_0) returns value not found
Error: exception Error: getValue(sff.io_inputDataVector_1) returns value not found
Error: exception Error: getValue(sff.io_inputDataVector_2) returns value not found
Then do
eval sff.io_inputDataVector_0
which will be a success, yielding
...
resolve dependencies
evaluate sff.io_inputDataVector_0 <= frame_0
evaluated sff.io_inputDataVector_0 <= 99.U<32>
Then perform the above peek again
peek sff.io_inputDataVector_0;peek sff.io_inputDataVector_1;peek sff.io_inputDataVector_2;
This time, it returns
peek sff.io_inputDataVector_0 99
peek sff.io_inputDataVector_1 99
peek sff.io_inputDataVector_2 99
which is more expected.
Why does the REPL act in this way? Or was there something I missed? Thanks!
*chisel-iotesters is in version 1.4.2, and chiseltest is in version 0.2.2. Both should be the newest version.
The firrtl interpreter REPL does not necessarily compute or store values that on a branch of a mux that is not used. This can lead to problems noted above like
Error: exception Error: getValue(sff.io_inputDataVector_0) returns value not found.
eval can be used to force unused branches to be evaluated anyway. The REPL is an experimental feature that has not had a lot of use.
treadle is the more modern chisel scala-based simulator. It is better supported and faster than the interpreter. It has a REPL of its own, but does not have an executeFirrtlRepl equivalent.
It must be run from the command line via the ./treadle.sh script in the root directory. One can also run sbt assembly to create a much faster launching jar that is placed in utils/bin. This REPL also has not been used a lot but I am interested on feedback that will make it better and easier to use.
I think the problem that you are seeing is that all your wires are being eliminated due to dead code elimination. There are a couple of things you should try to fix this.
Make sure you have meaningful connections of your wires. Hardware that does not ultimately affect an output is likely to get eliminated. In your example you do not have anything driving a top level output
You probably need your circuit to compute something with those registers. If the registers are initialized to 99 then constant propagation will likely eliminate them. I'm not sure what you are trying to get the circuit to do so it is hard to make a specific recommendation.
If you get the above done I think the repl will work as expected. I do have a question about which repl you are using (there are two: firrtl-interpreter and treadle) I recommend using the latter. It is more modern and better supported. It also has two commands that would be useful
show lofirrtl will show you the lowered firrtl, this is how you can see that lots of stuff from the high firrtl emitted by chisel3 has been changed.
symbol . shows you all symbols in circuit (. is a regex that matches everything.
Here is a somewhat random edit of your circuit that drives and output based on your frame Vec. This circuit will generate firrtl that will not eliminated the wires you are trying to see.
class Controller extends Module {
val io = IO(new Bundle {
val out = Output(UInt(32.W))
})
val sff = Module(new SFF)
val frame: Vec[UInt] = Reg(Vec(ProcedureSpaceSize, Integer32Bit))
when(reset.asBool()) {
for (i <- 0 until ProcedureSpaceSize) {
frame(i) := 99.U
}
}
frame.zipWithIndex.foreach { case (element, index) => element := element + index.U }
sff.io.inputDataVector := frame
io.out := sff.io.outputDataVector.reduce(_ + _)
}
class SFF extends Module {
val io = IO(new Bundle {
val inputDataVector: Vec[UInt] = Input(Vec(ProcedureSpaceSize, Integer32Bit))
val outputDataVector: Vec[UInt] = Output(Vec(ProcedureSpaceSize, Integer32Bit))
})
io.outputDataVector <> io.inputDataVector
}

Chisel Bundle connection and type Safety

Just noticed you can do something like:
class MyBundle1 extends Bundle {
val a = Bool()
val x = UInt(2.W)
val y = Bool()
}
class MyBundle2 extends Bundle {
val x = Bool()
val y = Bool()
}
class Foo extends Module {
val io = IO(new Bundle {
val in = Input(new MyBundle1)
val out = Output(new MyBundle2)
})
io.out := io.in
}
not get an error and actually get the correct or, at least, expected verilog. One would think that Chisel was type-safe wrt to Bundles in this sort of bulk connection. Is this intentional? If so, is there any particular reasons for it?
The semantics of := on Bundle allow the RHS to have fields that are not present on the LHS, but not vice-versa. The high-level description of x := y is "drive all fields of x with corresponding fields of y."
The bi-directional <> operator is stricter.
I can't speak precisely to the reasoning, but connection operators are often a matter of taste with respect to "do what I mean." The idea of including alternate forms of connection operators for future releases is an ongoing discussion, with the threat of "operator glut" being weighed against the ability of users to specify exactly what they'd like to do.

Chisel test - internal signals

I would like to test my code, so I'm doing a testbench. I wanted to know if it was possible to check the internal signals -like the value of the state register in this example- or if the peek was available only for the I/O
class MatrixMultiplier(matrixSize : UInt, cellSize : Int) extends Module {
val io = IO(new Bundle {
val writeEnable = Input(Bool())
val bufferSel = Input(Bool())
val writeAddress = Input(UInt(14.W)) //(matrixSize * matrixSize)
val writeData = Input(SInt(cellSize.W))
val readEnable = Input(Bool())
val readAddress = Input(UInt(14.W)) //(matrixSize * matrixSize)
val readReady = Output(Bool())
val readData = Output(SInt((2 * cellSize).W))
})
val s_idle :: s_writeMemA :: s_writeMemB :: s_multiplier :: s_ready :: s_readResult :: Nil = Enum(6)
val state = RegInit(s_idle)
...
and for the testbench :
class MatrixUnitTester(matrixMultiplier: MatrixMultiplier) extends PeekPokeTester(matrixMultiplier) { //(5.asUInt(), 32.asSInt())
println("State is: " + peek(matrixMultiplier.state).toString) // is it possible to have access to state ?
poke(matrixMultiplier.io.writeEnable, true.B)
poke(matrixMultiplier.io.bufferSel, false.B)
step(1)
...
EDIT : Ok, with VCD + GTKWave it is possible to graphically see these variables ;)
Good question. There's several parts to this answer
The Chisel supplied unit testing frameworks older chisel-testers and the newer chiseltest. Do not provide a mechanism to look into the wires directly.
Currently the chisel team is looking into ways of doing that.
Both provide indirect ways of doing it. Writing VCD output and using printf to see internal values
The Treadle firrtl simulator, which can directly simulate a firrtl (the direct output of the Chisel compiler) does allow for peek, and poking any signal directly. There are lots of examples of how its use in Treadle's unit tests. Treadle also provides a REPL shell which can be useful for exploring a circuit with manual peeks and pokes
The older chiseltesters (io-testers) and current chiseltest frameworks allow debugging the signal values with .peek() function that works well for the interface signals.
I haven't found a way to peek() an internal signal while debugging a testcase. However, Treadle simulator can dump the values of internal signals when it is running in verbose mode:
Add the annotation treadle.VerboseAnnotation to the test:
`test(new DecoupledGcd(16)).withAnnotations(Seq(WriteVcdAnnotation, treadle.VerboseAnnotation))`
When debugging in the IDEA and the test stops at breakpoint, the changes in the values of all internal signals up to this point are dumped to the Console.
This example will also generate the VCD wave file for further debugging.

chisel-firrtl combinational loop handling

I have met some troubles while simulating design that contains comb-loop. Firrtl throws exception like
"No valid linearization for cyclic graph"
while verilator backend goes normal with warnings.
Is it possible to simulate such design with firrtl backend?
And can we apply --no-check-comb-loops not for all design but for some part of it while elaborating?
Example code here:
import chisel3._
import chisel3.iotesters.PeekPokeTester
import org.scalatest.{FlatSpec, Matchers}
class Xor extends core.ImplicitModule {
val io = IO(new Bundle {
val a = Input(UInt(4.W))
val b = Input(UInt(4.W))
val out = Output(UInt(4.W))
})
io.out <> (io.a ^ io.b)
}
class Reverse extends core.ImplicitModule {
val io = IO(new Bundle {
val in = Input(UInt(4.W))
val out = Output(UInt(4.W))
})
io.out <> util.Reverse(io.in)
}
class Loop extends core.ImplicitModule {
val io = IO(new Bundle {
val a = Input(UInt(4.W))
val b = Input(UInt(4.W))
val mux = Input(Bool())
val out = Output(UInt(4.W))
})
val x = Module(new Xor)
val r = Module(new Reverse)
r.io.in <> Mux(io.mux, io.a, x.io.out)
x.io.a <> Mux(io.mux, r.io.out, io.a)
x.io.b <> io.b
io.out <> Mux(io.mux, x.io.out, r.io.out)
}
class LoopBackExampleTester(cc: Loop) extends PeekPokeTester(cc) {
poke(cc.io.mux, false)
poke(cc.io.a, 0)
poke(cc.io.b, 1)
step(1)
expect(cc.io.out, 8)
}
class LoopBackExample extends FlatSpec with Matchers {
behavior of "Loop"
it should "work" in {
chisel3.iotesters.Driver.execute(Array("--no-check-comb-loops", "--fr-allow-cycles"), () => new Loop) { cc =>
new LoopBackExampleTester(cc)
} should be(true)
}
}
I will start by noting that Chisel is intended to make synchronous, flop-based, digital design easier and more flexible. It is not intended to represent all possible digital circuits. Fundamentally, Chisel exists to make the majority of stuff easier while leaving things that tend to be more closely coupled to implementation technology (like Analog) to Verilog or other languages.
Chisel (well FIRRTL) does not support such apparent combinational loops even if it possible to show that the loop can't occur due to the actual values on the mux selects. Such loops break timing analysis in synthesis and can make it difficult to create a sensible circuit. Furthermore, it isn't really true that the loop "can't occur". Unless there is careful physical design done here, there will likely be brief moments (a tiny fraction of a clock cycle) where there will be shorts which can cause substantial problems in your ASIC. Unless you are building something like a ring oscillator, most physical design teams will ask you not to do this anyway. For the cases where it is necessary, these designs typically are closely tied to the implementation technology (hand designed with standard cells) and as such are not really within the domain of Chisel.
If you need such a loop, you can express it in Verilog and instantiate the design as a BlackBox in your Chisel.

Chisel/FIRRTL constant propagation & optimization across hierarchy

Consider a module that does some simple arithmetic and is controlled by a few parameters. One parameter controls the top level behavior: the module either reads its inputs from its module ports, or from other parameters. Therefore, the result will either be dynamically computed, or statically known at compile (cough, synthesis) time.
The Verilog generated by Chisel has different module names for the various flavors of this module, as expected. For the case where the result is statically known, there is a module with just one output port and a set of internal wires that are assigned constants and then implement the arithmetic to drive that output.
Is it possible to ask Chisel or FIRRTL to go further and completely optimize this away, i.e. in the next level of hierarchy up, just replace the instantiated module with its constant and statically known result? (granted that these constant values should by optimized away during synthesis, but maybe there are complicated use cases where this kind of elaboration time optimization could be useful).
For simple things that Firrtl currently knows how to constant propagate, it already actually does this. The issue is that it currently doesn't const prop arithmetic operators. I am planning to expand what operators can be constant propagated in the Chisel 3.1 release expected around New Years.
Below is an example of 3.0 behavior constant propagating a logical AND and a MUX.
import chisel3._
class OptChild extends Module {
val io = IO(new Bundle {
val a = Input(UInt(32.W))
val b = Input(UInt(32.W))
val s = Input(Bool())
val z = Output(UInt(32.W))
})
when (io.s) {
io.z := io.a & "hffff0000".U
} .otherwise {
io.z := io.b & "h0000ffff".U
}
}
class Optimize extends Module {
val io = IO(new Bundle {
val out = Output(UInt())
})
val child = Module(new OptChild)
child.io.a := "hdeadbeef".U
child.io.b := "hbadcad00".U
child.io.s := true.B
io.out := child.io.z
}
object OptimizeTop extends App {
chisel3.Driver.execute(args, () => new Optimize)
}
The emitted Verilog looks like:
module Optimize(
input clock,
input reset,
output [31:0] io_out
);
assign io_out = 32'hdead0000;
endmodule