Video 2 of 4 · ~12 minutes
Dr. Mike Borowczak · Electrical & Computer Engineering · CECS · UCF
Nonblocking assignment is the basis of every synchronous pipeline in existence — every CPU, every networking switch, every GPU. When Intel simulates a Xeon CPU, every flop in the design uses <=. When TSMC verifies their standard-cell library, every registered output uses <=.
Day 5 shift registers live or die by <=. Day 7 FSM state transitions demand <=. Your Day 11 UART framing pipeline is 6 <= deep. The rule is absolute — today's job is to build deep intuition via waveforms.
“<= is just assignment. Maybe with a slight delay? The variables get their new values as the lines execute.”
<= is not assignment at all — it's scheduling. Each <= line says “at the end of this timestep, please update the LHS to this value.” All the RHS expressions are evaluated using pre-edge values. Only after the whole block finishes do the updates apply — all at once, atomically.
Step 1: On the clock edge, evaluate all RHS expressions using current (pre-edge) values.
Step 2: Schedule all updates. Nothing has changed yet.
Step 3: Apply all scheduled updates simultaneously at the end of the timestep.
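The three steps above can be seen in a register swap. This is a minimal sketch: the module name nba_swap_demo and the signals x and y are hypothetical, chosen only for illustration. Because both RHS values are sampled before either LHS changes, the two registers exchange values on the edge.

```verilog
// Hypothetical demo (not part of the lab files): swap two registers with <=.
module nba_swap_demo;
  reg       clk;
  reg [3:0] x, y;

  // Steps 1-2: y and x are both read using their pre-edge values.
  // Step 3: both updates are applied together, so the values swap.
  always @(posedge clk) begin
    x <= y;
    y <= x;
  end

  initial begin
    clk = 0; x = 4'd3; y = 4'd7;
    $display("before edge: x=%0d y=%0d", x, y);  // prints x=3 y=7
    #5 clk = 1;                                  // rising edge
    #1 $display("after edge:  x=%0d y=%0d", x, y);  // prints x=7 y=3
    $finish;
  end
endmodule
```

With blocking assignments in the same block, x would be overwritten before y reads it, and both registers would end up holding 7.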
// WRONG: blocking
always @(posedge clk) begin
  b = a;  // b gets a immediately
  c = b;  // c gets the NEW b (= a!)
end
// Result: b=a, c=a (no pipeline at all)
// CORRECT: nonblocking
always @(posedge clk) begin
  b <= a;  // scheduled: b ← a(current)
  c <= b;  // scheduled: c ← b(current)
end
// Result: b=a(old), c=b(old) (proper 2-stage pipeline)
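One consequence of scheduling is that the order of nonblocking statements inside a block does not change the result. The sketch below assumes a hypothetical testbench, nba_order_demo, with two copies of the pipeline written in opposite statement order. It is an illustration, not the lab's source.

```verilog
// Hypothetical demo: nonblocking results do not depend on statement order.
module nba_order_demo;
  reg clk, a;
  reg b1, c1;   // written in source order: b, then c
  reg b2, c2;   // written in reversed order: c, then b

  always @(posedge clk) begin
    b1 <= a;
    c1 <= b1;
  end

  always @(posedge clk) begin
    c2 <= b2;   // still reads the pre-edge b2
    b2 <= a;
  end

  integer i;
  initial begin
    clk = 0; a = 1; b1 = 0; c1 = 0; b2 = 0; c2 = 0;
    for (i = 1; i <= 4; i = i + 1) begin
      #5 clk = 1;                      // rising edge: both blocks fire
      #1 if (b1 !== b2 || c1 !== c2) $display("MISMATCH at edge %0d", i);
      $display("edge %0d: b1=%b c1=%b  b2=%b c2=%b", i, b1, c1, b2, c2);
      #4 clk = 0; a = ~a;              // change a away from the clock edge
    end
    $finish;
  end
endmodule
```

Both pairs should match on every edge. Reordering the blocking version (c = b; b = a;) would change its behavior, which is exactly the fragility the rule avoids.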
Given: a toggles 1,0,1,0,1,… one value per cycle. Starting values: b=c=0. Trace b and c for 4 edges using nonblocking assignment:
always @(posedge clk) begin
  b <= a;
  c <= b;
end
cycle 1: a=1 → b=1, c=0
cycle 2: a=0 → b=0, c=1
cycle 3: a=1 → b=1, c=0
cycle 4: a=0 → b=0, c=1
Notice: c always shows the value b held one cycle earlier, so a reaches c only after passing through two flops. That's the pipeline signature.
With blocking (=) instead: b=a, then c=b=a on every edge, so c matches b with zero delay and the second stage vanishes. Pipeline collapsed.
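The trace can be checked in simulation. The testbench below is a sketch under stated assumptions: trace_demo and the signal names b_nb, c_nb, b_bl, c_bl are hypothetical, and a is toggled by the testbench away from the clock edge. It runs the nonblocking and blocking versions side by side for four edges.

```verilog
// Hypothetical testbench: same stimulus into a nonblocking and a blocking pair.
module trace_demo;
  reg clk, a;
  reg b_nb, c_nb;   // nonblocking pair
  reg b_bl, c_bl;   // blocking pair

  always @(posedge clk) begin
    b_nb <= a;
    c_nb <= b_nb;   // reads pre-edge b_nb
  end

  always @(posedge clk) begin
    b_bl = a;
    c_bl = b_bl;    // reads the b_bl just written
  end

  integer cycle;
  initial begin
    clk = 0; a = 1;
    b_nb = 0; c_nb = 0; b_bl = 0; c_bl = 0;
    for (cycle = 1; cycle <= 4; cycle = cycle + 1) begin
      #5 clk = 1;   // rising edge
      #1 $display("cycle %0d: a=%b | nonblocking b=%b c=%b | blocking b=%b c=%b",
                  cycle, a, b_nb, c_nb, b_bl, c_bl);
      #4 clk = 0;
      a = ~a;       // next value of a, applied between edges
    end
    $finish;
  end
endmodule
// cycle 1: a=1 | nonblocking b=1 c=0 | blocking b=1 c=1
// cycle 2: a=0 | nonblocking b=0 c=1 | blocking b=0 c=0
// cycle 3: a=1 | nonblocking b=1 c=0 | blocking b=1 c=1
// cycle 4: a=0 | nonblocking b=0 c=1 | blocking b=0 c=0
```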
~5 minutes
▸ COMMANDS
cd labs/week1_day04/ex2_pipeline_demo/
# Two modules: pipe_blocking, pipe_nonblocking
make sim
make wave # loads saved .gtkw
▸ EXPECTED STDOUT
BLOCKING:
t=10 a=1 b=1 c=1
t=20 a=0 b=0 c=0
(b and c always match a)
NONBLOCKING:
t=10 a=1 b=1 c=0
t=20 a=0 b=0 c=1
(c lags b by 1 cycle)
▸ GTKWAVE
Two traces stacked. Blocking: a, b, c all change together on each edge (pipeline collapsed). Nonblocking: a staircase pattern in which a leads, b follows by 1, and c follows by 2. This is the pipeline made visible.
$ yosys ... blocking.v
SB_DFF: 1 ← ONE flop
SB_LUT4: 0
(synthesizer proves b==c
and merges them into one flop.
Result: input → 1 flop → output)
$ yosys ... nonblocking.v
SB_DFF: 2 ← TWO flops
SB_LUT4: 0
(proper 2-stage pipeline)
Ask AI: “Explain, using the term 'active event queue', why nonblocking assignments prevent race conditions in sequential Verilog.”
TASK
Ask AI about simulator event queue semantics.
BEFORE
Predict: NBA events go to an NBA region that fires after the active region, ensuring atomic commit.
AFTER
Strong AI explains the active→inactive→NBA regions. Weak AI just says “it's delayed.”
TAKEAWAY
This is IEEE 1364 §5 material (scheduling semantics; Clause 11 in the 2005 edition), the reference for anyone who wants deep scheduling knowledge.
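The region ordering can be observed directly with two system tasks. $display runs where it appears in the block, before the nonblocking update has been applied. $strobe prints at the end of the time step, after the NBA updates have been committed. The module name nba_region_demo is hypothetical. This is a small sketch, not a full treatment of the event queue.

```verilog
// Hypothetical demo: observe q before and after the NBA update in one time step.
module nba_region_demo;
  reg clk, d, q;

  always @(posedge clk) begin
    q <= d;                                          // update is only scheduled here
    $display("$display (active region): q=%b", q);   // prints q=0, the old value
    $strobe ("$strobe  (end of step):   q=%b", q);   // prints q=1, after the update
  end

  initial begin
    clk = 0; d = 1; q = 0;
    #5 clk = 1;   // rising edge triggers the block
    #1 $finish;
  end
endmodule
```

Both messages are issued at the same simulation time. Only their position in the scheduling order differs.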
= in always @(*): combinational
<= in always @(posedge clk): sequential
① <= evaluates all RHS first, then updates simultaneously.
② This models real flip-flop behavior — simultaneous capture.
③ = in sequential blocks destroys pipeline behavior.
④ The rule is absolute: = for @(*), <= for @(posedge).
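The rule can be seen with both block types side by side. The sketch below assumes a hypothetical 2-bit counter, rule_demo, with signals count and next_count. The next value is computed combinationally with = and captured on the clock edge with <=.

```verilog
// Hypothetical demo: = in always @(*), <= in always @(posedge clk).
module rule_demo;
  reg       clk;
  reg [1:0] count;       // registered state
  reg [1:0] next_count;  // combinational next-state value

  // Combinational logic: blocking assignment
  always @(*) begin
    next_count = count + 2'd1;
  end

  // Sequential logic: nonblocking assignment
  always @(posedge clk) begin
    count <= next_count;
  end

  integer i;
  initial begin
    clk = 0;
    #1 count = 2'd0;   // set after time 0 so the @(*) block is already waiting
    for (i = 1; i <= 4; i = i + 1) begin
      #5 clk = 1;
      #1 $display("edge %0d: count=%0d", i, count);
      #4 clk = 0;
    end
    $finish;
  end
endmodule
// edge 1: count=1
// edge 2: count=2
// edge 3: count=3
// edge 4: count=0   (2-bit wraparound)
```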
🔗 Transfer
Video 3 of 4 · ~10 minutes
▸ WHY THIS MATTERS NEXT
The bare D-flop is rare in practice. Real designs have reset (to initialize state), enable (to conditionally update), and choices about synchronous vs asynchronous reset. Video 3 covers the patterns you'll see in 99% of production RTL.