Video 2 of 4 · ~14 minutes
Dr. Mike Borowczak · Electrical & Computer Engineering · CECS · UCF
The first performance review on any RTL codebase is an operator audit: where are the multipliers? The dividers? The wide comparators? These drive area, timing, and power. Senior designers read code and see gates.
Your Day 3 ALU uses +, -, &, |. Day 9 memory addressing uses comparators. Day 11 UART uses reduction operators for parity. Every lab after today uses this vocabulary.
“One operator = one operation. a + b and a & b look the same, so they cost the same.”
Each operator has a hardware footprint. & on 32 bits = 32 LUTs. + on 32 bits = 32 LUTs + 32 carry cells (a ripple chain with real delay). * on 32 bits = hundreds of LUTs or a dedicated DSP block.
| Category | Operators | Hardware Cost | On iCE40 HX1K |
|---|---|---|---|
| Bitwise | `&` `\|` `^` `~` | 1 LUT per bit | Cheap |
| Logical | `&&` `\|\|` `!` | Reduction + 1 LUT | Cheap |
| Reduction (unary) | `&` `\|` `^` | Tree of LUTs | Cheap (log N) |
| Arithmetic +/- | `+` `-` | Adder chain (carry) | Moderate (N LUTs + N SB_CARRY) |
| Arithmetic * | `*` | Multiplier tree | Expensive (~N² LUTs) |
| Relational | `==` `<` `>` | Comparator | Moderate |
| Shift (constant) | `<< 3` | Rewiring only | Free |
| Shift (variable) | `<< n` | Barrel shifter | Expensive |
| Conditional | `? :` | 2:1 mux | Cheap (1 LUT) |
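To make a few rows of the table concrete, here is a minimal sketch that puts reduction operators and both kinds of shift in one module. The module name `op_cost_demo` and its port names are hypothetical and not part of the course labs; the comments restate the table's cost categories rather than measured counts.

```verilog
// Hypothetical illustration module (name and ports are assumptions).
module op_cost_demo (
    input  wire [7:0] a,
    input  wire [2:0] n,
    output wire       parity,     // reduction XOR: tree of LUTs
    output wire       any_set,    // reduction OR: tree of LUTs
    output wire [7:0] shl_const,  // constant shift: rewiring only
    output wire [7:0] shl_var     // variable shift: barrel shifter
);
    assign parity    = ^a;        // 1 when an odd number of bits in a are set
    assign any_set   = |a;        // 1 when any bit in a is set
    assign shl_const = a << 3;    // low 3 bits become 0, no logic needed
    assign shl_var   = a << n;    // shift amount chosen at run time
endmodule
```

Synthesizing this and commenting out one `assign` at a time is one way to see which operator contributes which cells.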
wire [3:0] a = 4'b1010;
wire [3:0] b = 4'b0101;
wire [3:0] w_bitwise = a & b; // = 4'b0000 (per-bit AND)
wire w_logical = a && b; // = 1'b1 (both nonzero → true)
& operates on every bit independently, result same width as operands. Double && treats each operand as a boolean (any bit set = true), result always 1 bit. Mixing them up is a classic bug — especially with C/Java muscle memory.
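The difference is easiest to see inside an `if` condition. The following testbench is a small sketch (the module name `tb_bitwise_vs_logical` is an assumption, not a lab file) that reuses the same operand values and shows how `if (a & b)` silently takes the wrong branch when both operands are nonzero but share no set bits.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench name; operand values match the example above.
module tb_bitwise_vs_logical;
    reg  [3:0] a, b;
    wire [3:0] w_bitwise = a & b;   // per-bit AND, 4 bits wide
    wire       w_logical = a && b;  // boolean AND, 1 bit wide

    initial begin
        a = 4'b1010;
        b = 4'b0101;
        #1;  // let the continuous assignments settle
        $display("a & b  = %b", w_bitwise);  // prints 0000
        $display("a && b = %b", w_logical);  // prints 1

        // Classic bug: the bitwise result is all zeros, so this is false
        if (a & b)  $display("if (a & b):  taken");
        else        $display("if (a & b):  not taken");

        // Intended check: both operands are nonzero, so this is true
        if (a && b) $display("if (a && b): taken");
        else        $display("if (a && b): not taken");
        $finish;
    end
endmodule
```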
// 2:1 mux
assign y = sel ? a : b;
// 4:1 mux — fill in:
assign y = sel[1] ? ( sel[0] ? /* ? */ : /* ? */ )
: ( sel[0] ? /* ? */ : /* ? */ );
Answer: sel[1] ? (sel[0] ? d : c) : (sel[0] ? b : a). Selects: 00→a, 01→b, 10→c, 11→d. Yosys will build three 2:1 muxes in a tree: 2 at the first level, 1 at the root.
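As a sketch of how the completed expression could be wrapped and checked, the module and testbench below exercise all four select values. The names `mux4` and `tb_mux4`, the 8-bit data width, and the test values are assumptions for illustration; the lab's own files in ex2_mux_hierarchy may be organized differently.

```verilog
`timescale 1ns/1ps
// Hypothetical 4:1 mux built from the nested conditional above.
module mux4 (
    input  wire [7:0] a, b, c, d,
    input  wire [1:0] sel,
    output wire [7:0] y
);
    assign y = sel[1] ? (sel[0] ? d : c)
                      : (sel[0] ? b : a);
endmodule

// Hypothetical self-checking testbench.
module tb_mux4;
    reg  [7:0] a, b, c, d;
    reg  [1:0] sel;
    wire [7:0] y;
    reg  [7:0] expected;
    integer    i, errors;

    mux4 dut (.a(a), .b(b), .c(c), .d(d), .sel(sel), .y(y));

    initial begin
        a = 8'hA0; b = 8'hB1; c = 8'hC2; d = 8'hD3;
        errors = 0;
        for (i = 0; i < 4; i = i + 1) begin
            sel = i[1:0];
            #1;  // combinational logic settles
            case (sel)
                2'b00: expected = a;
                2'b01: expected = b;
                2'b10: expected = c;
                2'b11: expected = d;
            endcase
            if (y !== expected) begin
                errors = errors + 1;
                $display("FAIL: sel=%b y=%h expected=%h", sel, y, expected);
            end
        end
        $display("%0d mismatches", errors);
        $finish;
    end
endmodule
```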
For each expression on 8-bit buses, rank by iCE40 LUT cost (low/med/high):
assign x = a & b;
assign x = a + b;
assign x = a * b;
assign x = a << 3;
assign x = a << n; // where n is a 3-bit wire
assign x = (a > 8'd100);
~5 minutes
▸ COMMANDS
cd labs/week1_day02/ex2_mux_hierarchy/
make sim # iverilog + vvp
make wave # GTKWave
make stat # yosys synth_ice40
▸ EXPECTED STDOUT
PASS: 2:1 mux sel=0 → b
PASS: 2:1 mux sel=1 → a
PASS: 4:1 mux sel=00 → a
PASS: 4:1 mux sel=11 → d
=== 16 passed, 0 failed ===
▸ GTKWAVE — WHAT TO LOOK FOR
Signals: sel · a · b · c · d · y. Set all to hex. Watch y react instantly (delta cycle) to any change on sel or the data inputs — that's the signature of pure combinational logic.
Three 8-bit modules, side by side:
| Module | Body | SB_LUT4 | SB_CARRY | Verdict |
|---|---|---|---|---|
| bitwise_and | y = a & b; | 8 | 0 | Cheap |
| adder | y = a + b; | 8 | 8 | Moderate |
| multiplier | y = a * b; | ~80 | ~24 | Expensive |
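Run yosys -p "read_verilog op_compare.v; synth_ice40; stat" -q, then edit the module and rerun stat. Watch the cell counts change with the code. Note that -q limits console output to warnings and errors, so if the stat report does not appear, drop the flag.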
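For readers without the course files, here is one plausible shape for op_compare.v. The module names come from the table above; the port names and output widths are assumptions (in particular the 16-bit multiplier output, which keeps the full product), so the cell counts you get may not match the table exactly.

```verilog
// op_compare.v: hypothetical contents, the course file may differ.
module bitwise_and (
    input  wire [7:0] a, b,
    output wire [7:0] y
);
    assign y = a & b;   // one LUT per output bit
endmodule

module adder (
    input  wire [7:0] a, b,
    output wire [7:0] y
);
    assign y = a + b;   // LUTs plus a carry chain
endmodule

module multiplier (
    input  wire [7:0]  a, b,
    output wire [15:0] y   // assumed width: full 16-bit product
);
    assign y = a * b;   // built from LUTs on a part with no DSP blocks
endmodule
```

With several unrelated modules in one file, it may help to name the one you want measured, for example `synth_ice40 -top multiplier`, and rerun once per module.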
Ask AI: “Rank these on iCE40 LUT cost: a+b, a*b, a<<3, a<<n, a==b, all 16-bit.”
TASK
Ask for LUT cost ranking on 16-bit operands.
BEFORE
Predict: constant shift (free) → == (cheap) → + (moderate) → barrel shift (expensive) → * (most expensive).
AFTER
AI ordering usually correct. Absolute counts often 2× off — verify with Yosys.
TAKEAWAY
AI gives good ordinal rankings. Trust the ranking. Verify absolute numbers with stat.
stat is ground truth. AI estimates are useful for early rough sizing.
① Bitwise (&) = per-bit. Logical (&&) = 1-bit true/false.
② ? : is the mux. Nest for wider muxes.
③ Constant shifts are free. Variable shifts are expensive.
④ Multiply costs ~N² LUTs. Always check utilization.
🔗 Transfer
Video 3 of 4 · ~8 minutes
▸ WHY THIS MATTERS NEXT
You just saw that an 8-bit a + b can cost 8 LUTs + 8 carry cells. But what's the bit-width of the result? 8? 9? Here's a puzzle: 4'd15 + 4'd1 gives 0, not 16, unless you size the result correctly. Video 3 shows you why, and how to stop silent overflow bugs.
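If you want to see the puzzle for yourself before Video 3, the sketch below assigns the same sum to a 4-bit and a 5-bit wire. The testbench name `tb_overflow` and the signal names are assumptions for illustration only.

```verilog
`timescale 1ns/1ps
// Hypothetical testbench: same expression, two result widths.
module tb_overflow;
    reg  [3:0] a, b;
    wire [3:0] sum4 = a + b;  // 4-bit result: the carry-out is dropped
    wire [4:0] sum5 = a + b;  // 5-bit result: the carry-out is kept

    initial begin
        a = 4'd15;
        b = 4'd1;
        #1;  // let the continuous assignments settle
        $display("sum4 = %0d", sum4);  // prints 0
        $display("sum5 = %0d", sum5);  // prints 16
        $finish;
    end
endmodule
```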