Salmon, John K. (1991) Parallel hierarchical N-body methods. Dissertation (Ph.D.), California Institute of Technology. http://resolver.caltech.edu/CaltechTHESIS:04112011-113813016
Recent algorithmic advances utilizing hierarchical data structures have resulted in a dramatic reduction in the time required for computer simulation of N-body systems with long-range interactions. Computations which required O(N^2) operations can now be done in O(N log N) or O(N). We review these tree methods and find that they may be distinguished based on a few simple features. The Barnes-Hut (BH) algorithm has received a great deal of attention, and is the subject of the remainder of the dissertation. We present a generalization of the BH tree and analyze the statistical properties of such trees in detail. We also consider the expected number of operations entailed by an execution of the BH algorithm. We find an optimal value for m, the maximum number of bodies in a terminal cell, and confirm that the number of operations is O(N log N), even if the distribution of bodies is not uniform. The mathematical basis of all hierarchical methods is the multipole approximation. We discuss multipole approximations, for the case of arbitrary, spherically symmetric, and Newtonian Green's functions. We describe methods for computing multipoles and evaluating multipole approximations in each of these cases, emphasizing the tradeoff between generality and algorithmic complexity. N-body simulations in computational astrophysics can require 10^6 or even more bodies. Algorithmic advances are not sufficient, in and of themselves, to make computations of this size feasible. Parallel computation offers, a priori, the necessary computational power in terms of speed and memory. We show how the BH algorithm can be adapted to execute in parallel. We use orthogonal recursive bisection to partition space. The logical communication structure that emerges is that of a hypercube. A local version of the BH tree is constructed in each processor by iteratively exchanging data along each edge of the logical hypercube. We obtain speedups in excess of 380 on a 512 processor system for simulations of galaxy mergers with 180000 bodies. We analyze the performance of the parallel version of the algorithm and find that the overhead is due primarily to interprocessor synchronization delays and redundant computation. Communication is not a significant factor.
|Item Type:||Thesis (Dissertation (Ph.D.))|
|Degree Grantor:||California Institute of Technology|
|Division:||Physics, Mathematics and Astronomy|
|Thesis Availability:||Public (worldwide access)|
|Defense Date:||6 December 1990|
|Default Usage Policy:||No commercial reproduction, distribution, display or performance rights in this work are provided.|
|Deposited By:||Tony Diaz|
|Deposited On:||11 Apr 2011 22:30|
|Last Modified:||26 Dec 2012 04:33|
- Final Version
See Usage Policy.
Repository Staff Only: item control page