Compute nodes perform allreduce operations that swap processes between nodes. A first allreduce operation generates a first result from a first process contributed by a first compute node, a second process contributed by a second compute node, and zeros contributed by the other compute nodes. The first compute node replaces the first process with the first result. A second allreduce operation generates a second result from the first result at the first compute node, the second process at the second compute node, and zeros from the other compute nodes. The second compute node replaces the second process with the second result, which is the first process. A third allreduce operation generates a third result from the first result at the first compute node, the second result at the second compute node, and zeros from the other compute nodes. The first compute node replaces the first result with the third result, which is the second process.
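The three operations described above can be illustrated with a small single-machine simulation. This is only a sketch under stated assumptions: the text above does not name the reduction operator, so bitwise XOR is assumed here because zero is its identity and the stated results (the second result equals the first process, the third result equals the second process) follow from it. Each process image is stood in for by an integer bit pattern, and the names `allreduce_xor` and `swap_via_allreduce` are hypothetical helpers, not part of any real library.

```python
from functools import reduce
from operator import xor


def allreduce_xor(contributions):
    """Simulate an allreduce: combine every node's contribution with
    bitwise XOR (assumed operator) and return the one result that all
    nodes would receive."""
    return reduce(xor, contributions, 0)


def swap_via_allreduce(values, first, second):
    """Swap values[first] and values[second] using three simulated
    allreduce operations. All other nodes contribute zeros."""
    if first == second:
        raise ValueError("first and second must be different nodes")
    nodes = list(values)  # one integer "process image" per compute node

    def contributions():
        # Only the two participating nodes contribute their current
        # contents; every other node contributes zero.
        return [v if i in (first, second) else 0 for i, v in enumerate(nodes)]

    # First allreduce: first result = a ^ b, stored at the first node.
    nodes[first] = allreduce_xor(contributions())
    # Second allreduce: (a ^ b) ^ b = a, stored at the second node.
    nodes[second] = allreduce_xor(contributions())
    # Third allreduce: (a ^ b) ^ a = b, stored at the first node.
    nodes[first] = allreduce_xor(contributions())
    return nodes


swapped = swap_via_allreduce([0x11, 0x22, 0x33, 0x44], first=0, second=2)
print([hex(v) for v in swapped])
```

In this sketch the nodes at indices 1 and 3 keep their contents, because they only ever contribute zeros and never store a result. On a real system the same pattern would presumably map onto a collective such as an MPI allreduce with a bitwise XOR operator, but that mapping is an assumption and is not stated in the text above.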
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT
This invention was made with Government support under Contract No. B519700 awarded by the Department of Energy. The Government has certain rights in this invention.