CaltechTHESIS
  A Caltech Library Service

SPICE2 -- a spatial parallel architecture for accelerating the spice circuit simulator

Citation

Kapre, Nachiket Ganesh (2010) SPICE2 -- a spatial parallel architecture for accelerating the spice circuit simulator. Dissertation (Ph.D.), California Institute of Technology. http://resolver.caltech.edu/CaltechTHESIS:10262010-082537998

This is the latest version of this item.

Abstract

Spatial processing of sparse, irregular floating-point computation using a single FPGA enables up to an order of magnitude speedup (mean 2.8X speedup) over a conventional microprocessor for the SPICE circuit simulator. We deliver this speedup using a hybrid parallel architecture that spatially implements the heterogeneous forms of parallelism available in SPICE. We decompose SPICE into its three constituent phases, Model-Evaluation, Sparse Matrix-Solve, and Iteration Control, and parallelize each phase independently. We exploit data-parallel device evaluations in the Model-Evaluation phase and sparse dataflow parallelism in the Sparse Matrix-Solve phase, and compose the complete design in streaming fashion. We name our parallel architecture SPICE2: Spatial Processors Interconnected for Concurrent Execution for accelerating the SPICE circuit simulator. We program the parallel architecture with a high-level, domain-specific framework that identifies, exposes and exploits parallelism available in the SPICE circuit simulator. This design is optimized with an auto-tuner that can scale the design to use larger FPGA capacities without expert intervention and can even target other parallel architectures with the assistance of automated code-generation. This FPGA architecture is able to outperform conventional processors due to a combination of factors including high utilization of statically-scheduled resources, low-overhead dataflow scheduling of fine-grained tasks, and overlapped processing of the control algorithms. We demonstrate that we can independently accelerate Model-Evaluation by a mean factor of 6.5X (1.4-23X) across a range of non-linear device models and Matrix-Solve by 2.4X (0.6-13X) across various benchmark matrices while delivering a mean combined speedup of 2.8X (0.2-11X) for the two together when comparing a Xilinx Virtex-6 LX760 (40nm) with an Intel Core i7 965 (45nm).
With our high-level framework, we can also accelerate Single-Precision Model-Evaluation on NVIDIA GPUs, ATI GPUs, IBM Cell, and Sun Niagara 2 architectures. We expect approaches based on exploiting spatial parallelism to become important as frequency scaling slows down and modern processing architectures turn to parallelism (e.g., multi-core, GPUs) due to constraints of power consumption. This thesis shows how to express, exploit and optimize spatial parallelism for an important class of problems that are challenging to parallelize.
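
For readers unfamiliar with the three phases named in the abstract, the following is a minimal sequential sketch of how Model-Evaluation, Matrix-Solve, and Iteration Control fit together in a SPICE-style DC solve. It is not taken from the thesis and does not reflect the FPGA design: the circuit (a voltage source, two resistors, and one diode), all parameter values, and the function names are assumptions chosen for illustration, and a small dense Gaussian elimination stands in for the sparse solver that a real SPICE uses.

```python
import math

# Hypothetical circuit: VS -> R1 -> node a -> R2 -> node b -> diode -> ground.
VS, R1, R2 = 5.0, 1e3, 1e3      # assumed source voltage and resistances
IS, VT = 1e-14, 0.02585         # assumed diode saturation current, thermal voltage


def model_evaluation(vd):
    """Phase 1: evaluate the non-linear device and linearize it around vd."""
    e = math.exp(vd / VT)
    i_d = IS * (e - 1.0)        # diode current at vd
    g_d = IS / VT * e           # small-signal conductance dI/dV at vd
    i_eq = i_d - g_d * vd       # equivalent current source of the linear model
    return g_d, i_eq


def matrix_solve(a, b):
    """Phase 2: solve a x = b with dense Gaussian elimination (partial pivoting)."""
    n = len(b)
    a = [row[:] for row in a]   # work on copies so the caller's data is untouched
    b = b[:]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(a[r][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for r in range(k + 1, n):
            f = a[r][k] / a[k][k]
            for c in range(k, n):
                a[r][c] -= f * a[k][c]
            b[r] -= f * b[k]
    x = [0.0] * n
    for k in reversed(range(n)):
        s = sum(a[k][c] * x[c] for c in range(k + 1, n))
        x[k] = (b[k] - s) / a[k][k]
    return x


def simulate_dc(tol=1e-9, max_iter=100):
    """Phase 3: iteration control, repeating phases 1 and 2 until voltages settle."""
    v = [VS, 0.6]               # initial guess for node voltages [va, vb]
    for it in range(1, max_iter + 1):
        g_d, i_eq = model_evaluation(v[1])
        # Nodal equations (KCL at nodes a and b) with the linearized diode.
        a = [[1 / R1 + 1 / R2, -1 / R2],
             [-1 / R2, 1 / R2 + g_d]]
        b = [VS / R1, -i_eq]
        v_new = matrix_solve(a, b)
        delta = max(abs(x - y) for x, y in zip(v_new, v))
        v = v_new
        if delta < tol:
            return v[0], v[1], it
    raise RuntimeError("Newton-Raphson did not converge")
```

Calling `simulate_dc()` returns the two node voltages and the number of Newton-Raphson iterations used. In this sketch the phases run one after another on a single processor; the abstract's contribution is in parallelizing each phase and overlapping them, which this example does not attempt to show.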

Item Type:Thesis (Dissertation (Ph.D.))
Subject Keywords:fpga, spice, spatial, parallelism, pattern, vliw, dataflow, streaming, reconfigurable, architecture, auto-tuning
Degree Grantor:California Institute of Technology
Division:Engineering and Applied Science
Major Option:Computer Science
Thesis Availability:Public (worldwide access)
Research Advisor(s):
  • DeHon, Andre
Thesis Committee:
  • Martin, Alain J. (chair)
  • Meiron, Daniel I.
  • Bruck, Jehoshua
  • Trimberger, Steven
  • DeHon, Andre
Defense Date:1 September 2010
Record Number:CaltechTHESIS:10262010-082537998
Persistent URL:http://resolver.caltech.edu/CaltechTHESIS:10262010-082537998
Default Usage Policy:No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:6159
Collection:CaltechTHESIS
Deposited By: Nachiket Kapre
Deposited On:01 Nov 2010 17:27
Last Modified:26 Dec 2012 04:32

Available Versions of this Item

  • SPICE2 -- a spatial parallel architecture for accelerating the spice circuit simulator. (deposited 01 Nov 2010 17:27) [Currently Displayed]

Thesis Files

PDF - Final Version
See Usage Policy.

4Mb
