Kapre, Nachiket Ganesh (2010) SPICE2 -- a spatial parallel architecture for accelerating the spice circuit simulator. Dissertation (Ph.D.), California Institute of Technology. http://resolver.caltech.edu/CaltechTHESIS:10262010-082537998
This is the latest version of this item.
Spatial processing of sparse, irregular floating-point computation using a single FPGA enables up to an order of magnitude speedup (mean 2.8X speedup) over a conventional microprocessor for the SPICE circuit simulator. We deliver this speedup using a hybrid parallel architecture that spatially implements the heterogeneous forms of parallelism available in SPICE. We decompose SPICE into its three constituent phases: Model-Evaluation, Sparse Matrix-Solve, and Iteration Control and parallelize each phase independently. We exploit data-parallel device evaluations in the Model-Evaluation phase, sparse dataflow parallelism in the Sparse Matrix-Solve phase and compose the complete design in streaming fashion. We name our parallel architecture SPICE2: Spatial Processors Interconnected for Concurrent Execution for accelerating the SPICE circuit simulator. We program the parallel architecture with a high-level, domain-specific framework that identifies, exposes and exploits parallelism available in the SPICE circuit simulator. This design is optimized with an auto-tuner that can scale the design to use larger FPGA capacities without expert intervention and can even target other parallel architectures with the assistance of automated code-generation. This FPGA architecture is able to outperform conventional processors due to a combination of factors including high utilization of statically-scheduled resources, low-overhead dataflow scheduling of fine-grained tasks, and overlapped processing of the control algorithms. We demonstrate that we can independently accelerate Model-Evaluation by a mean factor of 6.5X(1.4--23X) across a range of non-linear device models and Matrix-Solve by 2.4X(0.6--13X) across various benchmark matrices while delivering a mean combined speedup of 2.8X(0.2--11X) for the two together when comparing a Xilinx Virtex-6 LX760 (40nm) with an Intel Core i7 965 (45nm). With our high-level framework, we can also accelerate Single-Precision Model-Evaluation on NVIDIA GPUs, ATI GPUs, IBM Cell, and Sun Niagara 2 architectures. We expect approaches based on exploiting spatial parallelism to become important as frequency scaling slows down and modern processing architectures turn to parallelism (\eg multi-core, GPUs) due to constraints of power consumption. This thesis shows how to express, exploit and optimize spatial parallelism for an important class of problems that are challenging to parallelize.
|Item Type:||Thesis (Dissertation (Ph.D.))|
|Subject Keywords:||fpga, spice, spatial, parallelism, pattern, vliw, dataflow, streaming, reconfigurable, architecture, auto-tuning,|
|Degree Grantor:||California Institute of Technology|
|Division:||Engineering and Applied Science|
|Major Option:||Computer Science|
|Thesis Availability:||Public (worldwide access)|
|Defense Date:||1 September 2010|
|Default Usage Policy:||No commercial reproduction, distribution, display or performance rights in this work are provided.|
|Deposited By:||Nachiket Kapre|
|Deposited On:||01 Nov 2010 17:27|
|Last Modified:||26 Dec 2012 04:32|
Available Versions of this Item
- SPICE2 -- a spatial parallel architecture for accelerating the spice circuit simulator. (deposited 01 Nov 2010 17:27) [Currently Displayed]
- Final Version
See Usage Policy.
Repository Staff Only: item control page