Video 3 of 4 · ~12 minutes
Dr. Mike Borowczak · Electrical & Computer Engineering · CECS · UCF
Every chip has PPA goals before design starts. “4 GHz @ 50W @ 50 mm²” is the sentence that opens an Apple CPU spec. The design process is then a structured hunt against those numbers: measure, optimize the worst metric, re-measure, decide when good enough. At Intel, this is formalized as the “PPA dashboard” tracked weekly; at AMD, it's the “area/power/frequency budget”; at every company, it's a methodology, not a vibe.
Your Day 10 PPA intro gave you the triangle. Today's methodology video makes it actionable: the three-step loop (measure → identify bottleneck → optimize), the reports you actually read, and the tradeoffs that matter for your capstone integration.
“I'll make the design as fast, as small, and as low-power as I can.”
You can't optimize all three. They are in active tension. The methodology is: measure all three, find the binding constraint, push on only that one. If your design is area-limited (won't fit on the iCE40 HX1K), don't chase Fmax. If you're meeting Fmax with 70% utilization, don't aggressively shrink — you'll hurt timing. PPA is navigation, not optimization.
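The "find the binding constraint" step can be sketched in a few lines. This is an illustrative sketch only — the metric values and budgets below are made-up placeholders, not numbers from any real report:

```python
# A minimal sketch of the PPA loop's "identify" step: compare each measured
# metric to its budget and pick the binding constraint (worst relative margin).
# All numbers are invented illustrations, not from a real tool report.

def binding_constraint(measured, targets):
    """Return the metric with the worst margin vs. its budget.

    For Fmax, higher is better; for LUTs and power, lower is better.
    Margin > 0 means the budget is met; the smallest margin binds.
    """
    margins = {
        "fmax_mhz": measured["fmax_mhz"] / targets["fmax_mhz"] - 1,
        "luts": 1 - measured["luts"] / targets["luts"],
        "power_mw": 1 - measured["power_mw"] / targets["power_mw"],
    }
    worst = min(margins, key=margins.get)
    return worst, margins

measured = {"fmax_mhz": 43.2, "luts": 1180, "power_mw": 30.0}  # made-up
targets = {"fmax_mhz": 25.0, "luts": 1280, "power_mw": 40.0}   # made-up
metric, margins = binding_constraint(measured, targets)
print(metric)  # luts
```

With these placeholder numbers, Fmax has 73% margin and power 25%, but LUTs only ~8% — so area is what you push on, exactly the "navigation, not optimization" point.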
$ make stat && make place
=== Synthesis (Yosys → iCE40) ===
SB_LUT4: 487 (38% of HX1K's 1280 LUTs)
SB_DFF: 124
SB_CARRY: 42
SB_RAM40_4K: 2 of 16 used
Total cells: 653
=== Placement + Routing (nextpnr) ===
Device: iCE40 HX1K-TQ144
Utilization: LUTs 38%, DFFs 9%, BRAM 12%
Max frequency: 43.2 MHz (target: 25 MHz) ✓
Critical path: 23.1 ns
SB_IO (i_data[0]) → 5.2 ns
SB_LUT4 (datapath[0]_cmp) → 7.4 ns
SB_CARRY chain (18 stages) → 8.9 ns
SB_DFF (result[31]) → 1.6 ns
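The report's numbers are internally consistent, and checking that is a good habit: the critical path is the sum of its segment delays, and Fmax is its reciprocal. A quick check using the figures above:

```python
# Sanity-check the timing report above: critical path = sum of segment
# delays; Fmax ~= 1000 / path_ns (ns -> MHz). Delays from the report.
segments_ns = [5.2, 7.4, 8.9, 1.6]   # SB_IO, SB_LUT4, SB_CARRY, SB_DFF
path_ns = sum(segments_ns)
fmax_mhz = 1000.0 / path_ns
print(round(path_ns, 1), "ns ->", round(fmax_mhz, 1), "MHz")
# 23.1 ns -> 43.3 MHz
```

The tool reports 43.2 MHz; the small gap from 43.3 is presumably rounding inside the report. Note the SB_CARRY chain alone is 8.9 ns — the biggest single segment, which is where you'd look first if you needed more speed.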
The report above is what your UART + FIFO echo design produces.
Question: if you extend this design, which metric is most likely to block the change first, and what's the mitigation?
Answer: LUT area — and the mitigation is to use block RAM (SB_RAM40_4K) instead of a LUT-based FIFO. One 4 Kbit block can hold a 32-deep × 8-wide FIFO with zero LUTs. This is the textbook case for moving from distributed to block memory.
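The capacity claim is easy to verify arithmetically, using only the numbers stated above:

```python
# Why one SB_RAM40_4K covers the FIFO: a 32-deep x 8-wide FIFO needs
# 256 bits of storage, well under one 4 Kbit block.
depth, width = 32, 8
fifo_bits = depth * width    # 256 bits needed
bram_bits = 4 * 1024         # one SB_RAM40_4K = 4096 bits
print(fifo_bits, bram_bits, fifo_bits <= bram_bits)  # 256 4096 True
```

In fact one block could hold a FIFO 16× deeper before you'd need a second SB_RAM40_4K.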
Exercise 3 · ~6 minutes
▸ COMMANDS
cd labs/week4_day14/ex3_ppa/
for p in 0 1 2 4; do
make clean
make all PIPE=$p
grep 'Max freq\|cells' logs/report.txt
done
python scripts/plot_ppa.py logs/*.txt
▸ EXPECTED STDOUT
PIPE=0: 44 MHz, 98 cells
PIPE=1: 82 MHz, 132 cells
PIPE=2: 121 MHz, 167 cells
PIPE=4: 143 MHz, 245 cells
# Fmax vs cells — diminishing
# returns past PIPE=2
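To plot these points you first have to parse them. This is a hypothetical sketch of the kind of extraction `scripts/plot_ppa.py` might do — the script's actual implementation isn't shown in this lesson:

```python
import re

# Hypothetical parser for sweep summary lines of the form shown above.
# (A sketch of what scripts/plot_ppa.py might do; its real code isn't shown.)
summary = """\
PIPE=0: 44 MHz, 98 cells
PIPE=1: 82 MHz, 132 cells
PIPE=2: 121 MHz, 167 cells
PIPE=4: 143 MHz, 245 cells
"""
points = [tuple(map(int, m.groups()))
          for m in re.finditer(r"PIPE=(\d+): (\d+) MHz, (\d+) cells", summary)]
print(points)
# [(0, 44, 98), (1, 82, 132), (2, 121, 167), (4, 143, 245)]
```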
▸ THE PARETO FRONTIER
Plot Fmax vs. cells — you get a classic Pareto curve. Beyond PIPE=2 the curve flattens: you're spending more area for less speed. The knee of the curve is almost always the right answer. This is what an engineer means by “I found the sweet spot.”
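You can put a number on "the curve flattens": compute the marginal return of each pipeline step — extra MHz bought per extra cell — from the sweep data above:

```python
# Marginal return of each pipeline step: extra MHz per extra cell,
# computed from the sweep results above. The sharp drop at PIPE=4 is the knee.
points = [(0, 44, 98), (1, 82, 132), (2, 121, 167), (4, 143, 245)]  # (PIPE, MHz, cells)

def marginal_returns(pts):
    """MHz gained per extra cell for each consecutive step of the sweep."""
    return [(p1, (f1 - f0) / (c1 - c0))
            for (_, f0, c0), (p1, f1, c1) in zip(pts, pts[1:])]

for pipe, ratio in marginal_returns(points):
    print(f"PIPE={pipe}: {ratio:.2f} MHz per extra cell")
# PIPE=1: 1.12 MHz per extra cell
# PIPE=2: 1.11 MHz per extra cell
# PIPE=4: 0.28 MHz per extra cell
```

The first two stages each buy about 1.1 MHz per cell; the jump to PIPE=4 buys only 0.28 — a 4× worse trade. That cliff is the knee.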
| Goal | Move | Cost |
|---|---|---|
| ↑ Fmax | Add pipeline register in critical path | +1 flop / stage, +1 cycle latency |
| ↑ Fmax | Use SB_CARRY chain instead of generic LUTs for adders | Already automatic; check synthesis |
| ↓ Area | Move FIFOs/ROMs to SB_RAM40_4K BRAM | Only works for ≥8×8 storage |
| ↓ Area | Share operators across mutually exclusive uses | Extra muxes, possible Fmax hit |
| ↓ Power | Clock-gate idle logic | Requires careful timing; iCE40 tricky |
Ask AI: “Here's my synthesis report: 72% LUT usage, Fmax 28 MHz, target 50 MHz. My critical path is a 32-bit ripple adder. Give me 3 optimization strategies ordered by expected Fmax gain per area cost.”
TASK
AI ranks PPA moves by ROI.
BEFORE
Predict: pipeline (cheapest), carry-select (middle), Kogge-Stone (expensive).
AFTER
Strong AI gives expected Fmax numbers + area deltas. Weak AI just lists moves without quantification.
TAKEAWAY
Good optimization advice has numbers attached. Without them it's just opinions.
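"Numbers attached" means you can rank the moves mechanically. A sketch of the ROI ranking you'd want the AI to produce — the Fmax and area deltas below are invented placeholders to show the shape of a good answer, not measured results:

```python
# Rank candidate optimization moves by expected Fmax gain per unit of
# area cost. All deltas below are invented placeholders for illustration.
moves = [
    ("pipeline the adder", 15.0, 40),    # (name, dFmax MHz, dLUTs) -- made-up
    ("carry-select adder", 8.0, 120),    # made-up
    ("Kogge-Stone adder", 12.0, 300),    # made-up
]
ranked = sorted(moves, key=lambda m: m[1] / m[2], reverse=True)
for name, df, da in ranked:
    print(f"{name}: {df / da:.3f} MHz/LUT")
# pipeline the adder: 0.375 MHz/LUT
# carry-select adder: 0.067 MHz/LUT
# Kogge-Stone adder: 0.040 MHz/LUT
```

With these placeholder deltas, the ranking matches the BEFORE prediction above: pipelining is the cheapest win, Kogge-Stone the most expensive per MHz.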
① PPA is navigation, not unbounded optimization. Measure, identify binding constraint, target.
② Every optimization trades one metric against another. No free wins.
③ The Pareto curve's knee is almost always the right answer.
④ Five moves (pipeline, carry chain, BRAM, operator sharing, clock gating) cover most cases.
🔗 Transfer
Video 4 of 4 · ~10 minutes
▸ WHY THIS MATTERS NEXT
You've learned the tools of verification. Video 4 addresses the completeness question: how do you know you've verified enough? Coverage analysis answers this. And then we look at where the field is going — formal verification, UVM, HLS, open-source silicon, and what HDL careers look like in 2026 and beyond.