Overview
The Zoltan library is a collection of data management services for parallel,
unstructured, adaptive, and dynamic applications. It simplifies the
load-balancing, data movement, unstructured communication, and memory usage
difficulties that arise in dynamic applications such as adaptive
finite-element methods, particle methods, and crash simulations.
Zoltan's data-structure neutral design also lets a wide range of applications
use it without imposing restrictions on application data structures.
Its object-based interface provides a simple and inexpensive way for
application developers to use the library and researchers to make new
capabilities available under a common interface.
Zoltan provides tools that help developers of parallel applications.
These tools are provided in an easy-to-use toolkit that is callable from C,
C++, and Fortran90.
Static Partitioning... |
Dynamic Load Balancing... |
Generally used as a pre-processor to an application. |
Runs side-by-side with an application. |
Can be (and usually is) implemented serially. |
Must be implemented in parallel. |
Has only modest concern for execution time. |
Must run quickly (time to load balance should not exceed time to run in
an unbalanced state). |
Has only modest concern for memory usage. |
Must use little memory (should not affect scalability of
application). |
Can use file-based interfaces (read geometry from a file; write partition
info to a file). |
Must use function-call interfaces. |
Has no dependence on an application's data structures. |
Needs information stored in an application's data structures. |
Accounts for part sizes and communication costs. |
Accounts for part sizes, communication costs, and data movement
costs. |
In our experience, no single partitioning strategy is effective for all
parallel computations. Some application require partitions based only on the
problem's workloads and geometry; others benefit from explicit consideration
of dependencies between data. Some applications require the highest quality
partitions possible, regardless of the cost to generate them; others can
sacrifice some quality so long as new partitions can be generated quickly.
For some applications, the cost to relocate data is prohibitively high, so
incremental partitioning algorithms are needed; other applications can
tolerate greater remapping costs. Most important, application developers
might not know in advance which strategies work best in their applications, so
the need a convenient means of comparing algorithms.
We provide two classes of parallel partitioning algorithms in the Zoltan
library:
A complicated part of dynamic repartitioning is the need to move data from old
processors to new ones. This data migration requires deletions and insertions
from the application data structures as well as communication between the
processors.
Unlike static applications, where communication patterns remain fixed
throughout the computation, dynamic applications can have complicated,
changing communication patterns. For example:
Dynamic applications often need to locate off-processor information. After
repartitioning, for example, a processor might need to rebuild ghost cells and
lists of data to be communicated. It might know which data it needs, but not
where the data are located.
Dynamic applications rely heavily on the ability to allocate and free memory
as needed. Memory leaks and invalid memory accesses are common to developing
software. Although many software development tools let users track memory
bugs, these tools are often not available on state-of-the-art parallel
computers.