Citation
Kapre, Nachiket Ganesh (2010) SPICE2 -- a spatial parallel architecture for accelerating the spice circuit simulator. Dissertation (Ph.D.), California Institute of Technology. http://resolver.caltech.edu/CaltechTHESIS:10262010-082537998
This is the latest version of this item.
Abstract
Spatial processing of sparse, irregular floating-point computation using a single FPGA enables up to an order of magnitude speedup (mean 2.8X speedup) over a conventional microprocessor for the SPICE circuit simulator. We deliver this speedup using a hybrid parallel architecture that spatially implements the heterogeneous forms of parallelism available in SPICE. We decompose SPICE into its three constituent phases (Model-Evaluation, Sparse Matrix-Solve, and Iteration Control) and parallelize each phase independently. We exploit data-parallel device evaluations in the Model-Evaluation phase and sparse dataflow parallelism in the Sparse Matrix-Solve phase, and we compose the complete design in streaming fashion. We name our parallel architecture SPICE2: Spatial Processors Interconnected for Concurrent Execution for accelerating the SPICE circuit simulator. We program the parallel architecture with a high-level, domain-specific framework that identifies, exposes and exploits parallelism available in the SPICE circuit simulator. This design is optimized with an auto-tuner that can scale the design to use larger FPGA capacities without expert intervention and can even target other parallel architectures with the assistance of automated code-generation. This FPGA architecture is able to outperform conventional processors due to a combination of factors including high utilization of statically-scheduled resources, low-overhead dataflow scheduling of fine-grained tasks, and overlapped processing of the control algorithms. We demonstrate that we can independently accelerate Model-Evaluation by a mean factor of 6.5X (1.4–23X) across a range of non-linear device models and Matrix-Solve by 2.4X (0.6–13X) across various benchmark matrices while delivering a mean combined speedup of 2.8X (0.2–11X) for the two together when comparing a Xilinx Virtex-6 LX760 (40nm) with an Intel Core i7 965 (45nm).
With our high-level framework, we can also accelerate Single-Precision Model-Evaluation on NVIDIA GPUs, ATI GPUs, IBM Cell, and Sun Niagara 2 architectures. We expect approaches based on exploiting spatial parallelism to become important as frequency scaling slows down and modern processing architectures turn to parallelism (e.g., multi-core, GPUs) due to constraints of power consumption. This thesis shows how to express, exploit and optimize spatial parallelism for an important class of problems that are challenging to parallelize.
| Item Type: | Thesis (Dissertation (Ph.D.)) |
|---|---|
| Subject Keywords: | fpga, spice, spatial, parallelism, pattern, vliw, dataflow, streaming, reconfigurable, architecture, auto-tuning |
| Degree Grantor: | California Institute of Technology |
| Division: | Engineering and Applied Science |
| Major Option: | Computer Science |
| Thesis Availability: | Public (worldwide access) |
| Research Advisor(s): | |
| Thesis Committee: | |
| Defense Date: | 1 September 2010 |
| Record Number: | CaltechTHESIS:10262010-082537998 |
| Persistent URL: | http://resolver.caltech.edu/CaltechTHESIS:10262010-082537998 |
| Default Usage Policy: | No commercial reproduction, distribution, display or performance rights in this work are provided. |
| ID Code: | 6159 |
| Collection: | CaltechTHESIS |
| Deposited By: | Nachiket Kapre |
| Deposited On: | 01 Nov 2010 17:27 |
| Last Modified: | 26 Dec 2012 04:32 |
Thesis Files
- PDF (Final Version), 4Mb. See Usage Policy.


