![]() |
Partitioning-Under the HoodBy Ashok Kulkarni, Senior Technical Marketing Manager, Synplicity, Inc. Synplicity’s Certify product supports many customers who need to partition their ASIC into multiple FPGAs, which naturally raises questions about the design partitioning process. Why doesn’t my automatic partitioning work every time perfectly? How do I know what a “good” partition is and what drives a good ASIC partitioning? What does an automatic partitioner base its decisions on? To answer these and other ASIC partitioning questions requires some understanding of the partitioning problem in general. The need of partitioning exists in many practical applications, for example, in chip and circuit design, reliability of communication networks, transportation planning, and cluster analysis. While in practice there are some differences in how partitioning is performed across these applications, it turns out the theory is very broad-based and applies across all applications. This article will provide a basic overview of ASIC partitioning and how general partitioning theory is applied to it. Key factors in ASIC Partitioning In multi-FPGA ASIC prototyping, partitioning is used to divide the gates in an ASIC logic network into single-FPGA sized groups. Factors that affect the outcome of ASIC partitioning include the number of I/Os, gate capacity, performance, and interfaces to external components such as black box memories or mixed signal functions. Often these are inter-related, such as how in order to free up more I/Os designers might sacrifice performance by using pin multiplexing. Typically the I/O restrictions are the biggest constraint that has to be managed; you always run out of I/Os first. This is because FPGAs have a limited amount of I/Os relative to the amount of logic and routing available on-chip, dictated by Rent’s rule which states that N = tGr where N = number of I/Os, t = number of terminals per gate, G = number of gates, r = Rent’s constant. The quality of the partitioning step will often determine the performance and how many FPGAs parts are required to implement a particular design. Therefore a good partitioning is very critical for both performance and cost of the multi-FPGA ASIC prototype. Partitioning - Mainly heuristic and not theoretical To understand why partitioning is not always a push-button it is important to start with the algorithms that drive partitioning. Partitioning has been determined to be a NP (non-deterministic polynomial time) -complete problem which means a polynomial time algorithm does not exist for the problem and therefore it is inefficient to perform exhaustive search to find an optimal solution. Therefore automatic partitioning does not work every time predictably. Hence guidance to the tool from the user based on the design knowledge helps the partitioning process. Many heuristic algorithms have been proposed to solve the partitioning problem. They can be classified as deterministic and stochastic algorithms. A deterministic algorithm progresses towards the solution by making deterministic decisions. Stochastic, on the other hand, makes random decisions (coin tossing) in the search for a solution. The partitioning problem is often represented as a graph connectivity problem and is one of the classical subjects in graph theory. A brief definition of graph theory is that a graph is a set of objects called points or vertices connected by links called lines or edges; a site map for a website often shows a “graph” of web pages; each page is a vertex and links between one page and another are the lines. Graph theory involves creating a structure that can be extended by assigning a weight to each edge of the graph. By applying weights the graph can be directed to create relevant partitions; for example if the graph represents a road network, the weights could represent the length of each road; an alternative weighting might be based on traffic density on each road. In the case of ASIC partitioning those weights can be driven by a number of factors, including logical interconnect, gate count, shared resources, or other factors. Finding the minimum cut of an undirected edge-weighted graph is a fundamental approach to partitioning. It consists in finding a nontrivial partition of the graph’s vertex set V into two parts such that the cut weight, the sum of the weights of the edges connecting the two parts, is minimum. So a given graph is partitioned into two blocks such that the cost of nets cut is minimum (see figure 1). In other words, given a network consisting of a set of cells (modules) connected by a set of nets (signals), the min-cut bipartitioning problem consists of finding a partition of the set into two blocks A and B such that the number of nets which have cells in both blocks is minimal. There have been several algorithms proposed to arrive at efficient solution and often are based on Fiduccia-Mattheyses (FM) heuristic or some variation thereof. In the next section we will briefly describe the algorithm proposed by Fiduccia-Mattheyses.
Fiduccia-Mattheyses (FM) heuristic The Fiduccia-Mattheyses (FM) heuristic for partitioning hypergraphs is an iterative improvement algorithm. FM starts with a possibly random solution and changes the solution by a sequence of moves which are organized as passes. At the beginning of a pass, all vertices are free to move (unlocked), and each possible move is labeled with the immediate change to the cost it would cause; this is called the gain of the move (positive gains reduce solution cost, while negative gains increase it). Iteratively, a move with highest gain is selected and executed, and the moving vertex is locked, i.e., is not allowed to move again during that pass. Since moving a vertex can change gains of adjacent vertices, after a move is executed all affected gains are updated. Selection and execution of a best-gain move, followed by gain update, are repeated until every vertex is locked. Then, the best solution seen during the pass is adopted as the starting solution of the next pass. The algorithm terminates when a pass fails to improve solution quality. This short description highlights the iterative and non-deterministic nature of partitioning algorithms. (There is a large body of work done by Fiduccia- Mattheyes, and subsequent work by several others, and therefore the reader may wish to consult their original work for details, see http://www.eecs.umich.edu/~mazum/fmcut1.pdf and http://www.eecs.umich.edu/~imarkov/pubs/book/part_survey.pdf,) Summary So what factors should you consider to achieve the best partitioning results? Select an FPGA with highest gate count, maximum number of I/Os (to minimize the effects of Rent’s rule) and finally adopt both automatic (such as Synplicity’s Certify) and manual partitioning (with your design knowledge). Typically the user knowledge of the design entered manually gives automatic partitioning a better starting point to explore the possible solution space. In short, developing a bullet-proof auto-partitioner will always be a work in progress, but within Synplicity R&D we are constantly revising and improving on the heuristics driving auto-partitioning. |
|
From The Syndicated Q3, 2006,
published quarterly by Synplicity, Inc., www.synplicity.com. |