Timing Convergence: The Whack a Mole Game

Whack a Mole is an old (pre-electronic) arcade game. On a table surface there are a number of holes with moles hiding in them. A mole pops up out of its hole and you hit it with a hammer, causing it to retreat. The moment that mole returns to its hole, another random one pops up, and you frantically hit mole after mole on the head until they give up and the table is clear. Achieving timing closure on complex chip designs bears a strong resemblance to this old game, but is less fun. As you fix each critical path problem, new ones are uncovered or created.

Each new generation of FPGAs has given us higher performance and higher capacity. Designs have become larger and more complex, containing many clock domains, embedded multiply-accumulate functions, embedded processors and a variety of memory resources. These changes have helped propel FPGAs into many new applications. At the same time, predictability of timing in a Synthesis/Place and Route flow has degraded with each generation. The split of path delay between predictable cell delay and less predictable interconnect delay has shifted substantially toward interconnect. Interconnect is inherently less predictable because there are many different routing paths between a driver and a load, each with a different delay. The fastest choices are usually the scarcest, and routing congestion often leads to sub-optimal delays. Note that if the best routing resources weren’t scarce, then FPGAs would be much more expensive and power hungry.
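To make the congestion effect concrete, here is a minimal sketch of nets competing for a few routing choices. It is a toy model, not how any real router works: the route names, delays, capacities and the `route_nets` and `worst_path_delay` helpers are all made up for illustration.

```python
# Toy model of routing congestion. All names and numbers are made up.

# Hypothetical routing choices between one region and another, fastest first:
# (name, wire delay in ns, how many nets the resource can carry)
ROUTES = [
    ("direct", 0.3, 1),
    ("local", 0.8, 2),
    ("long_detour", 2.0, 10),
]

CELL_DELAY_NS = 1.5  # assumed fixed logic delay on every path


def route_nets(num_nets):
    """Give each net, in order, the fastest route that still has capacity.

    Returns the wire delay assigned to each net.
    """
    remaining = {name: capacity for name, _, capacity in ROUTES}
    wire_delays = []
    for _ in range(num_nets):
        for name, delay, _ in ROUTES:
            if remaining[name] > 0:
                remaining[name] -= 1
                wire_delays.append(delay)
                break
        else:
            raise ValueError("not enough routing resources for all nets")
    return wire_delays


def worst_path_delay(num_nets):
    """Cell delay plus the slowest wire delay among the routed nets."""
    return CELL_DELAY_NS + max(route_nets(num_nets))


if __name__ == "__main__":
    for n in (1, 3, 4):
        print(n, "nets -> worst path delay", worst_path_delay(n), "ns")
```

The cell delay never changes in this sketch. The worst path gets slower only because the fast routes run out and later nets are pushed onto the long detour.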

Another source of unpredictability is the embedded IP that has been introduced in FPGAs over the past several generations. This IP includes memories of varying sizes, DSP accelerators (configurable multiply-accumulate functions) and processors, all distributed in a non-uniform way across the FPGA die. Suppose you increase the size of a memory (responding to a feature request from marketing). The synthesis tool may now map it to a different type of memory IP. Unfortunately, the new memory IP is only available in a couple of special columns on the FPGA die, so the placement of your design is distorted from the original placement, stretching critical wires to that column and back.

In many cases a fix to a timing problem involves an RTL change. Usually, changes that improve timing increase area consumption. Consider a large design with multiple modules. Module A has been placed next to modules B and C. When we “fix” the RTL for module A to resolve a timing problem, it expands into the area that modules B and C were using. This forces components in B and C to move and stretch, often creating new critical paths. Note that nothing about the logic in modules B or C has changed and a logic synthesis flow would not change its estimate of interconnect delays in B or C just because module A increased in size. This is because the coupling between RTL changes and a new critical path in B or C is physical in nature.
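The squeeze on a neighboring module can be sketched with a little arithmetic. This is a toy model under loud assumptions: module B is treated as a rectangle of fixed area, wire delay is taken as proportional to Manhattan distance, and every constant and function name below is hypothetical.

```python
# Toy model: module B keeps the same logic (same area, same cell delay),
# but a neighbor takes away part of the width B was placed in.
# Every constant below is an illustrative assumption.

CLOCK_PERIOD_NS = 5.2
CELL_DELAY_NS = 3.0            # logic delay on B's critical path, unchanged
WIRE_DELAY_PER_UNIT_NS = 0.1   # assumed delay per unit of Manhattan distance


def worst_wire_length(area, available_width):
    """Corner-to-corner Manhattan distance of B's placement rectangle."""
    height = area / available_width
    return available_width + height


def slack_ns(area, available_width):
    """Clock period minus (cell delay + estimated wire delay)."""
    wire_delay = WIRE_DELAY_PER_UNIT_NS * worst_wire_length(area, available_width)
    return round(CLOCK_PERIOD_NS - (CELL_DELAY_NS + wire_delay), 3)


B_AREA = 100.0
before = slack_ns(B_AREA, available_width=10.0)  # B sits in a 10 x 10 square
after = slack_ns(B_AREA, available_width=5.0)    # A grew; B is now 5 x 20

if __name__ == "__main__":
    print("slack before A grows:", before, "ns")
    print("slack after A grows: ", after, "ns")
```

In this sketch the slack in B goes from positive to negative even though `B_AREA` and `CELL_DELAY_NS` never change. An estimate that looks only at B's logic has no input that would move.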

So how do we win at the “Whack a Mole” game? The effects that make design iterations unpredictable are physical in nature, which leads naturally to physical synthesis as a solution. With physical synthesis, when the RTL for module A is changed and it enlarges, stretching the wires in modules B and C, the new longer interconnect is correctly estimated, and a combination of optimization, placement and local routing along the new critical paths automatically fixes the problem. Many of the moles are whacked for you and you never even see them try to pop up.

Physical synthesis usually yields a higher operating frequency, but for many designers of complex chips its controlled, more predictable timing convergence is of even higher value.

From The Syndicated Q4, 2006, published quarterly by Synplicity, Inc., www.synplicity.com.
Copyright © 2006 Synplicity, Inc. All rights reserved.