Coupling and Data Analytics

Overview:

One of the most important aspects of the center's software suite is the coupling between two independent applications. A second goal is to understand the performance of common communication patterns used in data analysis on extreme-scale architectures. This suite of proxy apps for coupling and data analytics is called CIAN, short for CESAR integrated analytics suite.

  • We generate a coupled data model by transferring the solution back and forth between two meshes, for example a hexahedral mesh and a tetrahedral mesh with different numbers of elements. The meshes are created in memory on the fly. An analytically known solution is defined on the source mesh, projected onto the target mesh, and then projected back to the source mesh. The interpolation errors, along with the time taken in various parts of the projection, are reported.
  • Parallel data analysis often uses neighbor exchanges and various global reductions. To test the performance of these communication patterns, the following proxy apps are available:
    • Neighbor exchange
    • Merge-based reduction
    • Swap-based reduction
    • All-to-all exchange
    • Parallel sample sort

Dependencies:

Accessing data models in parallel requires the help of several external libraries.

  • HDF5 for reading and writing meshes (only required for coupling proxy app)
  • MOAB for accessing the coupled mesh data model (only required for coupling proxy app)
  • DIY2 for communication patterns and domain decomposition (required for all apps)

The README in the CIAN download explains how to download and install the dependencies.

Proxy Apps:

The CIAN suite contains two categories of proxy apps: (1) Coupling and (2) Communication.

  • Coupling Proxy App

Coupling is essential to the entire project and to multi-physics problems. The driving questions behind our coupling apps are: (1) What is the memory requirement to access data via the coupled interface, in terms of data and code? (2) How does it compare with access to the native interfaces of individual codes? (3) What are the performance and accuracy of coupling?

This proxy app does the following:

  • Generates two different test meshes in memory, domain decomposed over MPI processes
  • Transfers solution from first mesh to second (i.e., couples the meshes)
  • Computes error between transferred solution and original
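The three steps above can be illustrated with a much-reduced sketch. It is not the CIAN coupling app and uses neither MOAB nor MPI: it assumes two uniform 1D meshes on [0, 1] in place of hex and tet meshes, piecewise-linear interpolation as the projection, and sin(2πx) as the analytically known solution. All function names here are hypothetical.

```python
import math

def make_mesh(num_elements):
    """Return the vertex coordinates of a uniform 1D mesh on [0, 1]."""
    return [i / num_elements for i in range(num_elements + 1)]

def interpolate(src_x, src_vals, dst_x):
    """Piecewise-linear interpolation of (src_x, src_vals) at each point of dst_x.

    Assumes both coordinate lists are ascending and cover the same interval.
    """
    out = []
    j = 0
    for x in dst_x:
        # Advance to the source element [src_x[j], src_x[j + 1]] that contains x.
        while j < len(src_x) - 2 and x > src_x[j + 1]:
            j += 1
        t = (x - src_x[j]) / (src_x[j + 1] - src_x[j])
        out.append((1 - t) * src_vals[j] + t * src_vals[j + 1])
    return out

def round_trip_error(n_source, n_target, f=lambda x: math.sin(2 * math.pi * x)):
    """Project f from the source mesh to the target mesh and back.

    Returns the maximum absolute difference between the original and the
    round-tripped values at the source vertices.
    """
    source = make_mesh(n_source)
    target = make_mesh(n_target)
    original = [f(x) for x in source]                   # solution on the source mesh
    on_target = interpolate(source, original, target)   # source -> target
    back = interpolate(target, on_target, source)       # target -> source
    return max(abs(a - b) for a, b in zip(original, back))

if __name__ == "__main__":
    for n_src, n_tgt in [(8, 13), (64, 103)]:
        print(n_src, n_tgt, round_trip_error(n_src, n_tgt))
```

In this toy setting the round-trip error shrinks as both meshes are refined, which is the kind of accuracy figure the proxy app reports alongside timing and memory footprint.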

The following input and output parameters are used in this app:

  • Inputs: different mesh types (hex FEM, tet FEM, SEM), same or different communicators, same or different processes for source and target mesh.
  • Outputs: timing, memory footprint, accuracy

  • Communication Proxy Apps

Interprocess communication is essential for parallel data analysis, and the cost of the communication often dominates extreme-scale computations. For example, the stencil computation below requires neighborhood exchange between the green, red, and blue processes. The driving questions behind our communication apps are: (1) What are the tradeoffs between latency and bandwidth at various data and system sizes? (2) How do communication algorithm parameters such as number of rounds and radix affect those tradeoffs?
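To make the stencil example concrete, here is a minimal sequential sketch of a neighbor exchange. It assumes a 1D domain split into contiguous blocks and a 3-point averaging stencil; the "exchange" is simulated with plain list accesses rather than MPI or DIY2 calls, and every function name is hypothetical.

```python
def split_into_blocks(values, num_blocks):
    """Split a list into contiguous blocks of near-equal size."""
    n = len(values)
    bounds = [i * n // num_blocks for i in range(num_blocks + 1)]
    return [values[bounds[i]:bounds[i + 1]] for i in range(num_blocks)]

def neighbor_exchange(blocks):
    """Simulate each block receiving the edge value of its left and right neighbor.

    Returns one (left_ghost, right_ghost) pair per block; a ghost is None
    where the block sits on the domain boundary. Blocks must be non-empty.
    """
    ghosts = []
    for b in range(len(blocks)):
        left = blocks[b - 1][-1] if b > 0 else None
        right = blocks[b + 1][0] if b < len(blocks) - 1 else None
        ghosts.append((left, right))
    return ghosts

def stencil_step(blocks):
    """Apply a 3-point average to every block, using the exchanged ghost values."""
    result = []
    for block, (left, right) in zip(blocks, neighbor_exchange(blocks)):
        # Pad with ghost values; at a domain boundary, replicate the edge value.
        padded = ([block[0] if left is None else left]
                  + block
                  + [block[-1] if right is None else right])
        result.append([(padded[i - 1] + padded[i] + padded[i + 1]) / 3
                       for i in range(1, len(padded) - 1)])
    return result

if __name__ == "__main__":
    values = [float(v) for v in range(12)]
    blocked = stencil_step(split_into_blocks(values, 3))
    serial = stencil_step([values])  # one block: no exchange needed
    print([x for blk in blocked for x in blk] == serial[0])
```

Each block needs only one value from each neighbor per step here; with wider stencils or higher dimensions, the number and size of the exchanged messages grow accordingly.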

There are four basic communication patterns. Unlike plain MPI, communication in CIAN happens between blocks, not processes (an MPI process can own several blocks), and blocks can be in or out of core.
  • Neighbor exchange: exercises the number and size of messages exchanged between neighboring blocks
  • Merge-based reduction: the equivalent of MPI_Reduce, a tree-based merge reduction
  • Swap-based reduction: the equivalent of MPI_Reduce_scatter using a butterfly or radix-k swap algorithm
  • All-to-all exchange: the equivalent of MPI_Alltoall
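The difference between the two reduction patterns can be sketched as follows. This is a sequential simulation in plain Python, not the DIY2 API: blocks are list entries, the reduction operator is assumed to be an element-wise sum, and the swap version is restricted to radix 2 (a butterfly) with a power-of-two block count. All names are hypothetical.

```python
def merge_reduce(block_data, radix=2):
    """Tree-based merge reduction with element-wise sum.

    In every round, groups of `radix` blocks merge their vectors into the
    group's first block. Returns (result held by the root block, rounds).
    """
    active = [list(v) for v in block_data]  # data of blocks still participating
    rounds = 0
    while len(active) > 1:
        merged = []
        for g in range(0, len(active), radix):
            group = active[g:g + radix]
            acc = group[0]
            for other in group[1:]:  # the group root merges its partners' data
                acc = [x + y for x, y in zip(acc, other)]
            merged.append(acc)
        active = merged
        rounds += 1
    return active[0], rounds

def swap_reduce(block_data):
    """Radix-2 (butterfly) swap reduction with element-wise sum.

    Assumes a power-of-two number of blocks and a vector length divisible by
    it. In every round, partner blocks split their current range in half,
    keep one half each, and add the partner's copy of that half.
    Returns ([(start_index, reduced_piece) per block], rounds).
    """
    n, m = len(block_data), len(block_data[0])
    assert n & (n - 1) == 0 and m % n == 0
    data = [list(v) for v in block_data]  # data[b] covers indices [lo[b], hi[b])
    lo, hi = [0] * n, [m] * n
    rounds = 0
    dist = n // 2
    while dist >= 1:
        new_data, new_lo, new_hi = [], [], []
        for b in range(n):
            partner = b ^ dist  # partner holds the same index range as b
            mid = (lo[b] + hi[b]) // 2
            start, stop = (lo[b], mid) if b & dist == 0 else (mid, hi[b])
            s, e = start - lo[b], stop - lo[b]
            new_data.append([x + y for x, y in zip(data[b][s:e], data[partner][s:e])])
            new_lo.append(start)
            new_hi.append(stop)
        data, lo, hi = new_data, new_lo, new_hi
        dist //= 2
        rounds += 1
    return list(zip(lo, data)), rounds
```

With 4 blocks, `merge_reduce` leaves the complete result on one block after 2 rounds at radix 2 or 1 round at radix 4, while `swap_reduce` leaves each block holding one quarter of the result. That is the same distinction as between MPI_Reduce and MPI_Reduce_scatter.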
One application of the above patterns is parallel sorting; hence, the fifth proxy app in the communications suite is a parallel sample sort. We also compare its performance with that of a parallel histogram sort.
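For readers unfamiliar with sample sort, the following is a minimal sequential sketch of the general technique, not the CIAN implementation. Blocks are simulated as Python lists, the all-to-all exchange is a plain loop, and the `samples_per_block` parameter and the splitter selection rule are illustrative assumptions.

```python
import bisect

def sample_sort(block_data, samples_per_block=4):
    """Simulate a parallel sample sort over a list of blocks.

    Returns one sorted list per block such that concatenating the lists
    in block order yields the globally sorted data.
    """
    p = len(block_data)

    # 1. Every block sorts its local data and contributes evenly spaced samples.
    local = [sorted(d) for d in block_data]
    samples = []
    for d in local:
        if d:
            step = max(1, len(d) // samples_per_block)
            samples.extend(d[::step])
    samples.sort()

    # 2. Pick p - 1 splitters from the gathered samples.
    splitters = ([samples[(i * len(samples)) // p] for i in range(1, p)]
                 if samples else [])

    # 3. All-to-all exchange: route each item to the block that owns its range.
    received = [[] for _ in range(p)]
    for d in local:
        for x in d:
            received[bisect.bisect_right(splitters, x)].append(x)

    # 4. Every block sorts what it received.
    return [sorted(r) for r in received]

if __name__ == "__main__":
    blocks = [[9, 3, 7, 1], [8, 2, 6, 0], [5, 4, 11, 10]]
    print(sample_sort(blocks))
```

Steps 1 and 2 correspond to a small gather of samples, and step 3 is where the all-to-all exchange pattern appears; how evenly the data ends up balanced across blocks depends on how well the samples represent it.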
 
The following input and output parameters are used in these apps:
  • Inputs: message size, number of messages, system size, algorithm parameters (number of rounds, radix)
  • Outputs: time

Download: 

CIAN is publicly available on GitHub.

Contact: 

For more information or to report bugs, contact Tom Peterka (tpeterka@mcs.anl.gov).