Video 4 of 4 · ~9 minutes
Dr. Mike Borowczak · Electrical & Computer Engineering · CECS · UCF
LED matrix displays (the ones in elevators, billboards, stadium scoreboards) are, at their core, ROM-driven pattern sequencers. Text-mode video hardware uses a character ROM for text overlays. Audio synthesizers use sine tables. Modems use constellation tables for symbol decoding. The “counter addresses ROM” pattern is the canonical architecture for any precomputed-then-replayed signal.
Your Day 11 UART HELLO demo uses this pattern to drive the character stream. Your capstone melody player (a common project) uses it for sine tables. Your Go Board's LED patterns, 7-seg sequences, and scrolling text all follow the same shape.
“Runtime computation is always better than a lookup table — it uses less memory and is more 'clever.'”
Runtime compute costs logic (LUTs, DSPs). Table lookup costs memory (EBRs). FPGAs include dedicated block RAM that sits idle unless a table or buffer uses it. For fixed functions with a small input range (sine, log, gamma correction, CRC tables), lookup usually beats compute. The choice isn't about cleverness; it's about resource allocation.
module pattern_sequencer #(
    parameter STEP_LEN    = 10_000_000,   // cycles per pattern step (0.4 s @ 25 MHz)
    parameter PATTERN_LEN = 16,           // number of steps in sequence
    parameter INIT_FILE   = "pattern.hex"
) (
    input  wire       i_clk, i_reset,
    output wire [7:0] o_leds
);
    localparam STEP_W = $clog2(STEP_LEN);
    localparam ADDR_W = $clog2(PATTERN_LEN);

    // Timer: counts STEP_LEN cycles, then pulses w_step_tick for one cycle
    reg [STEP_W-1:0] r_step_counter;
    wire w_step_tick = (r_step_counter == STEP_LEN - 1);
    always @(posedge i_clk) r_step_counter <= i_reset ? 0 : (w_step_tick ? 0 : r_step_counter + 1);

    // Address counter: advances on each w_step_tick and wraps by overflow,
    // which assumes PATTERN_LEN is a power of two
    reg [ADDR_W-1:0] r_addr;
    always @(posedge i_clk) r_addr <= i_reset ? 0 : (w_step_tick ? r_addr + 1 : r_addr);

    // ROM: PATTERN_LEN × 8-bit patterns (16 × 8 by default)
    rom_array #(.ADDR_W(ADDR_W), .DATA_W(8), .INIT_FILE(INIT_FILE))
        u_rom (.i_clk(i_clk), .i_addr(r_addr), .o_data(o_leds));
endmodule
Change pattern.hex → different animation. No RTL changes needed.
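The counter widths and step timing in the module above can be sanity-checked in a few lines of Python. This is a minimal sketch: `clog2` is a hypothetical helper that mirrors Verilog's `$clog2` for positive integers, and `CLK_HZ` assumes the 25 MHz clock named in the `STEP_LEN` comment.

```python
def clog2(n: int) -> int:
    """Smallest w such that 2**w >= n (mirrors Verilog $clog2 for n >= 1)."""
    return (n - 1).bit_length()

CLK_HZ = 25_000_000        # assumed board clock, from the STEP_LEN comment
STEP_LEN = 10_000_000      # cycles per pattern step
PATTERN_LEN = 16           # steps in the sequence

step_w = clog2(STEP_LEN)               # width of r_step_counter
addr_w = clog2(PATTERN_LEN)            # width of r_addr
step_seconds = STEP_LEN / CLK_HZ       # time spent on one pattern
loop_seconds = step_seconds * PATTERN_LEN

print(f"STEP_W = {step_w} bits, ADDR_W = {addr_w} bits")
print(f"{step_seconds} s per step, {loop_seconds} s per full sequence")
```

With the default parameters this gives a 24-bit step counter, a 4-bit address, 0.4 s per step, and 6.4 s per full sequence.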
For an audio synthesizer, we need sin(x) at 1024 points per cycle with 10-bit precision:
// Python: generate the table at build time
//   for i in range(1024): print(f"{int((math.sin(2*math.pi*i/1024)+1)/2 * 1023):03x}")
module sine_rom (
    input  wire       i_clk,
    input  wire [9:0] i_phase,   // 0..1023 (full period)
    output reg  [9:0] o_sample   // 0..1023 (scaled)
);
    reg [9:0] mem [0:1023];
    initial $readmemh("sine_1024x10.hex", mem);
    always @(posedge i_clk) o_sample <= mem[i_phase];
endmodule
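The one-line generator in the comment above can be expanded into a small standalone script. This is a sketch under the same scaling as that comment (offset to 0..1, scale to 10 bits, truncate with `int`); the function name `sine_table` and its `depth` and `bits` parameters are illustrative, not part of the course files.

```python
import math

def sine_table(depth: int = 1024, bits: int = 10) -> list[str]:
    """Return one hex string per table entry, offset so sin = -1 maps to 0."""
    full_scale = (1 << bits) - 1          # 1023 for 10 bits
    digits = (bits + 3) // 4              # hex digits needed per entry
    table = []
    for i in range(depth):
        unit = (math.sin(2 * math.pi * i / depth) + 1) / 2   # 0.0 .. 1.0
        table.append(f"{int(unit * full_scale):0{digits}x}")
    return table

if __name__ == "__main__":
    # One value per line, the layout $readmemh accepts.
    print("\n".join(sine_table()))
```

Redirecting the output (for example `python gen_sine.py > sine_1024x10.hex`, where the script name is hypothetical) produces the file that `sine_rom` loads.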
Your 7-seg display needs to show hex digits 0-F. Input: 4-bit value. Output: 7-bit segment pattern. Is this a good ROM candidate?
~5 minutes
▸ COMMANDS
cd labs/week3_day09/ex4_sequencer/
cat pattern.hex # 16 × 8-bit patterns
make sim
make prog
# watch LEDs cycle
# edit pattern.hex, reprogram,
# new animation — no RTL change
▸ EXPECTED BEHAVIOR
pattern.hex = walk right, then back:
01 → 02 → 04 → 08 → 10 → 20 → 40 → 80 → 80 → 40 ...
Board shows one lit LED sweeping
across the 8 LEDs and back; pattern
wraps every 6.4 sec
(16 steps × 0.4s)
▸ KEY OBSERVATION
Edit the hex file, reprogram — new animation. Content and architecture are decoupled. Adding a new pattern = 30 seconds of editing + 10 seconds of programming. No synthesis, no Verilog changes. This is the elegance of lookup-based designs.
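Since the animation lives entirely in the hex file, a short script can generate it instead of hand-editing. The sketch below reproduces the 16-entry sequence listed under expected behavior, assuming it continues symmetrically back down to 01; `bounce_pattern` and `to_hex_lines` are hypothetical helper names.

```python
def bounce_pattern(width: int = 8) -> list[int]:
    """One-hot walk up through the LEDs, then the same walk in reverse."""
    walk = [1 << i for i in range(width)]    # 01, 02, 04, ... 80
    return walk + walk[::-1]                 # ... 80, 40, ... 01

def to_hex_lines(patterns: list[int]) -> list[str]:
    """Format each 8-bit pattern as two hex digits, one entry per line."""
    return [f"{p:02x}" for p in patterns]

if __name__ == "__main__":
    print("\n".join(to_hex_lines(bounce_pattern())))
```

Swapping `bounce_pattern` for any other function that returns 16 byte values gives a new animation with the same RTL.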
| Application | Pattern | Memory Cost |
|---|---|---|
| Audio sine synth | phase counter → sine ROM | 1-3 EBRs |
| 7-seg message scroller | char counter → char ROM → 7seg decoder | 1-2 EBRs |
| Microcoded controller | state counter → microcode ROM | 1-4 EBRs (depending on instruction width) |
| CRC lookup | byte in → CRC table → XOR accumulator | 1 EBR (256×16 table) |
| Gamma correction | pixel → gamma ROM | 1-2 EBRs |
| Font ROM (VGA text) | char + row → font ROM → pixel stream | 1-2 EBRs (128 chars × 8×8 pixels) |
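The memory-cost column can be roughly cross-checked with a block-count estimate. This sketch assumes the iCE40 EBR is 4096 bits, configurable as 256×16, 512×8, 1024×4, or 2048×2, and picks whichever shape tiles the table with the fewest blocks. The function name `ebr_count` is hypothetical, and real synthesis tools may map a memory differently.

```python
import math

# Assumed iCE40 EBR shapes as (depth, width); each holds 4096 bits.
EBR_MODES = [(256, 16), (512, 8), (1024, 4), (2048, 2)]

def ebr_count(depth: int, width: int) -> int:
    """Fewest EBRs needed to tile a depth x width memory using one shape."""
    return min(
        math.ceil(depth / d) * math.ceil(width / w)
        for d, w in EBR_MODES
    )

if __name__ == "__main__":
    for name, depth, width in [
        ("CRC table", 256, 16),
        ("Sine ROM", 1024, 10),
        ("Font ROM (128 chars x 8 rows)", 1024, 8),
    ]:
        print(f"{name}: {depth}x{width} -> {ebr_count(depth, width)} EBR(s)")
```

Under these assumptions the 256×16 CRC table fits in 1 EBR, the 1024×10 sine ROM needs 3, and a 128-character 8×8 font needs 2.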
Ask AI: “Design a Verilog pattern sequencer that drives 8 LEDs with a heartbeat pattern (pulse, pulse, pause, pulse, pulse, pause) at 1 Hz on a 25 MHz clock. Include Python code to generate the hex file.”
TASK
AI designs complete sequencer + table.
BEFORE
Predict: timer + address counter + array ROM + Python hex generator.
AFTER
Strong AI pairs Verilog and Python. Weak AI writes only Verilog with hard-coded tables.
TAKEAWAY
Separation of content and architecture = Python + Verilog.
① Pattern sequencer = timer + address counter + ROM.
② Lookup usually beats compute for fixed functions: memory is abundant, logic is dear.
③ Tiny tables (< 100 bits) → LUTs. Large tables (more than a few hundred bits) → EBRs.
④ Content (hex files) and architecture (RTL) should be decoupled.
Q1: What triggers block RAM inference on iCE40?
initial $readmemh (for ROM) or sync write + sync read (for RAM). The key: synchronous read via a registered output.
Q2: How many EBRs does a 768×8 RAM consume?
Q3: Why is lookup usually better than runtime compute on FPGAs?
Q4: When does the case-ROM pattern still make sense?
🔗 End of Day 9
Day 10 · Timing, Numerics, Performance-Power-Area
▸ WHY THIS MATTERS NEXT
You now know what your chip has (LUTs + EBRs) and how to use memory idioms. Day 10 answers the next question: how fast can it go, and what does it cost? You'll learn to read timing reports, understand numerical architecture tradeoffs, and measure PPA (Performance / Power / Area) — the three axes every real design is evaluated on.