Integration of Synplify DSP Algorithms with Xilinx Embedded
Systems
Chris Eddington, Senior Technical Marketing
Manager, Synplicity, Inc.
Introduction
A common problem facing IC and FPGA designers today is implementing
DSP algorithms in their technology of choice, whether it be an Altera Stratix
III, Xilinx Virtex-5, or space-qualified Actel Axcelerator device. Synplify
DSP software offers one of the fastest ways to capture, analyze, and verify
the mathematical behavior of a DSP algorithm, and then automatically implement
it with optimized architectures in any device technology. As FPGA customers
upgrade their designs from past generations of devices, this high level
of modeling and portability becomes very attractive. Synplify DSP software can
deliver up to a 10X productivity improvement for DSP algorithm implementation.
When compared to other high-level DSP design tools, Synplify DSP
offers significant productivity, portability, and verification advantages. Using
a “DSP Synthesis” methodology and a powerful architectural optimization
engine, Synplify DSP eliminates the need to hand-tune pipelining, retiming,
and resource sharing in the model. In contrast, other high-level DSP design
tools take more of a schematic-capture approach, using parameterized Simulink
blocks that require the designer to build optimizations into the algorithm model.
This adds significant time and effort, especially when considering the
additional verification required to be sure the optimizations were done correctly.
If the model is intended to be reused with different technologies, or there
is a need to explore speed grades or lower-cost devices, a DSP Synthesis
methodology yields even larger savings in effort.
One of the important considerations when creating DSP algorithm
designs is how to integrate them with larger system elements such as processors,
on-chip buses, and memory controllers. DSP algorithms are often integrated
with embedded processor systems. For example, it is common for wireless products
to implement the physical layer (PHY) as a peripheral of an embedded processor
and the Media Access Control (MAC) layer functions as software.
There are two approaches to integrating algorithms implemented
using the Synplify DSP tool with system level IP:
1) RTL Integration: Integrate the Synplify DSP Output into other RTL tool flows
2) Model Integration: Integrate System IP into Synplify DSP models
The second approach has obvious advantages in that a unified,
high-level model is clearly easier to debug and verify. However, a mixture of
these approaches also has advantages and is commonly used. For example, a DSP
algorithm might best be modeled and verified along with a peripheral interface
bus, with the Synplify DSP synthesis engine then used to create optimized RTL
that can plug directly into an embedded processor development framework.
This article briefly outlines the first approach using the Xilinx
System Generator™ tool to integrate Synplify DSP designs into the Xilinx embedded
design flow. Synplify DSP includes an export feature that automatically creates
a blackbox that will work in System Generator. This can have great benefits
for users who want to leverage the system IP and Hardware-in-the-Loop (HIL) capabilities
of Xilinx hardware and still gain the benefits of a DSP Synthesis methodology.
The Benefits of Abstraction in Design Capture
Synplify DSP software allows designers to describe their algorithm
behavior without specifying the implementation architecture. The focus
is on mathematically “sample-accurate” behavior; there is no need to
specify clocks, resets, target devices, or memory and multiplier implementations.
Architecture selection and resource mapping are done by the DSP synthesis engine,
which applies the user’s speed and area constraints across
the entire design. In addition to this higher level of abstraction, the Synplify
DSP tool provides vector support for easily creating highly parallel or multichannel
designs, offers broad multi-rate support, and makes it easy to analyze and choose
quantization settings.
Figure 1 shows an example design using Synplify DSP to model a
20-tap Adaptive Filter using the LMS algorithm. The use of vector signals to
carry the 20-element signals makes it very simple to capture and
model the behavior. In this case all 20 taps are updated every clock cycle using
a vector multiplied by an error factor, requiring 20 multiplies in addition
to the 20 required for the filter taps. For this example a sample rate of 10
MHz is specified. If the target is a high-speed device, for instance a Virtex-5,
the user might, at this point, start modifying the architecture to share
multipliers and other resources to exploit faster clocks and reduce area.
Figure 1. Concise Description of a 20-tap LMS
Adaptive Filter Using Synplify DSP Vector Signals
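To make the arithmetic behind Figure 1 concrete, here is a minimal Python sketch of a textbook LMS filter. It is not the Synplify DSP model itself. The function names, the step size mu, the random test signal, and the "unknown" system being identified are all assumptions chosen for illustration. The sketch shows the two vector operations described above: 20 multiplies for the filter output, and 20 more for the coefficient update scaled by a single error factor.

```python
import random


def lms_step(weights, delay_line, x, desired, mu):
    """Process one input sample with a textbook LMS update.

    Returns (output, error, new_weights, new_delay_line).
    """
    # Shift the new sample into the tap delay line (newest sample first).
    delay_line = [x] + delay_line[:-1]
    # Filter output: one multiply per tap (20 for a 20-tap filter).
    y = sum(w * d for w, d in zip(weights, delay_line))
    error = desired - y
    # Coefficient update: the delay-line vector scaled by one error
    # factor, costing another multiply per tap in the same sample period.
    scale = mu * error
    weights = [w + scale * d for w, d in zip(weights, delay_line)]
    return y, error, weights, delay_line


def lms_identify(unknown, num_samples=5000, mu=0.05, seed=0):
    """Adapt a filter toward a hypothetical 'unknown' FIR response."""
    rng = random.Random(seed)
    n = len(unknown)
    weights = [0.0] * n
    delay_line = [0.0] * n
    ref_line = [0.0] * n  # delay line of the reference (unknown) system
    error = 0.0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        ref_line = [x] + ref_line[:-1]
        desired = sum(h * d for h, d in zip(unknown, ref_line))
        _, error, weights, delay_line = lms_step(
            weights, delay_line, x, desired, mu
        )
    return weights, error


if __name__ == "__main__":
    target = [0.05 * (i - 10) for i in range(20)]  # illustrative response
    learned, last_error = lms_identify(target)
    worst = max(abs(w - h) for w, h in zip(learned, target))
    print("largest coefficient error:", worst)
```

Every tap is updated on every sample, which is why a direct implementation needs 40 multiplies per sample period and why sharing multipliers becomes attractive on a fast device.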
Rather than restructuring the model by hand, the designer can use the
Synplify DSP synthesis engine to automatically apply folding optimizations
(also called resource sharing) and explore the cost tradeoffs. Figure 2 shows
how various folding factors result in significant area reduction on a Virtex-5
device. At a 10 MHz sample rate, the synthesis engine inferred higher clock
rates and automatically created resource-sharing architectures for the adaptive
filter. As higher folding factors were applied, multiplier use went from 32
down to 4 and the overall LUT count was reduced by over 80% (at folding
factor = 20). However, the cost incurred is a higher 200 MHz clock rate, which
requires more registers to meet timing. It is possible to continue beyond a
folding factor of 20 for further reductions, but with diminishing returns and
an increasing risk of failing timing at the higher clock rate. The
Synplify DSP tool enables this kind of exploration from a single algorithm model
and automates the implementation into RTL.
Figure 2. Exploration of Folding for the Adaptive
FIR example.
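The relationship between folding factor and clock rate can be sketched with a few lines of Python. This is an idealized back-of-the-envelope model, not the Synplify DSP cost function. It assumes the clock rate is the sample rate multiplied by the folding factor (consistent with the 10 MHz sample rate and 200 MHz clock at a folding factor of 20 quoted above), and that multipliers can be shared perfectly. The function name and parameters are hypothetical.

```python
import math


def folding_estimate(sample_rate_hz, folding_factor, multiplies_per_sample):
    """Idealized folding model (an assumption, not the tool's algorithm).

    Returns (clock_hz, min_multipliers): the clock rate needed to
    time-share hardware folding_factor times per sample, and a lower
    bound on multipliers if sharing were perfect.
    """
    clock_hz = sample_rate_hz * folding_factor
    min_multipliers = math.ceil(multiplies_per_sample / folding_factor)
    return clock_hz, min_multipliers


if __name__ == "__main__":
    # 20 filter multiplies + 20 update multiplies per sample, at 10 MHz.
    for fold in (1, 2, 5, 10, 20):
        clock, mults = folding_estimate(10_000_000, fold, 40)
        print(fold, clock, mults)
```

The multiplier count here is only a lower bound. The counts reported in Figure 2 come from actual synthesis results, which depend on how the tool maps operations to device resources, so they should not be expected to match this estimate.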
System Integration with Xilinx System Generator Tool
At this stage the designer will likely need to integrate this
design into a processor subsystem. For Xilinx users, common choices
are a MicroBlaze or embedded PowerPC processor. Often this type of integration
is done with the Xilinx System Generator (SysGen) tool. Synplify DSP now makes
it much easier to integrate its RTL output into System Generator. Using the
Export to System Generator option, Synplify DSP automatically creates a
blackbox model that can be instantiated in System Generator designs and co-simulated
with the other system models that SysGen supports.
Figure 3 shows how easy this is. The SysGen export feature is
an additional checkbox available in the RTL output configuration of the Synplify
DSP user interface. This is available for both Verilog and VHDL outputs when
Xilinx devices are selected.
Figure 3. Synplify DSP's Export to SysGen feature
turned on for VHDL
Figures 4 and 5 show how the blackbox model is integrated and configured
for use in SysGen models. Synplify DSP creates a model file that, underneath,
instantiates a SysGen blackbox and the ModelSim block. These are all configured
appropriately, including the configuration M-script required by
SysGen RTL blackboxes. This blackbox model can be copied into existing SysGen
designs, or you can start adding system components to interface to the new Synplify
DSP optimized algorithm.
Figure 4. SysGen blackbox model generated by
Synplify DSP includes all necessary configuration elements.
Figure 5. Blackbox exported by Synplify DSP used in
a SysGen model
Summary
The Synplify DSP tool enables modeling and algorithm implementation
from a much higher level of abstraction, yielding significant benefits
in productivity and design reuse.
There are several approaches to integrating these algorithms into
larger systems. Xilinx’s System Generator is a popular flow for creating
embedded system designs. Synplify DSP designs can be integrated quickly and easily
into this flow using the automated Export to System Generator feature. This
provides Xilinx users a complete system integration solution for DSP-based FPGA
designs.