Dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job

United States Patent

8,140,889
March 20, 2012
Methods, systems, and products for dynamically reassigning a connected node to a block of compute nodes for re-launching a failed job that include: identifying that a job failed to execute on the block of compute nodes because connectivity failed between a compute node assigned as at least one of the connected nodes for the block of compute nodes and its supporting I/O node; and re-launching the job, including selecting an alternative connected node that is actively coupled for data communications with an active I/O node; and assigning the alternative connected node as the connected node for the block of compute nodes running the re-launched job.
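The three steps named in the abstract (identify the connectivity failure, select an alternative connected node, reassign and re-launch) can be illustrated with a minimal sketch. This is not the patented implementation: the class names, the `link_up` flag, the `candidates` list, and the `launch` callback are all hypothetical, introduced only to make the control flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IONode:
    name: str
    active: bool = True  # hypothetical flag: the I/O node itself is up


@dataclass
class ComputeNode:
    name: str
    io_node: Optional[IONode] = None  # supporting I/O node, if any
    link_up: bool = True              # hypothetical flag: link to io_node works

    def has_active_io(self) -> bool:
        """True if this node is actively coupled to an active I/O node."""
        return self.io_node is not None and self.io_node.active and self.link_up


@dataclass
class Block:
    nodes: List[ComputeNode]
    connected: ComputeNode  # node currently assigned as the connected node


def failed_on_connectivity(block: Block) -> bool:
    """Step 1: the job failed because the connected node lost its I/O node."""
    return not block.connected.has_active_io()


def select_alternative(block: Block,
                       candidates: List[ComputeNode]) -> Optional[ComputeNode]:
    """Step 2: pick a candidate that still reaches an active I/O node."""
    for node in candidates:
        if node is not block.connected and node.has_active_io():
            return node
    return None


def relaunch(block: Block,
             candidates: List[ComputeNode],
             launch: Callable[[Block], None]) -> bool:
    """Step 3: reassign the connected node, then re-launch the job."""
    if not failed_on_connectivity(block):
        return False  # failure had another cause; outside this sketch
    alternative = select_alternative(block, candidates)
    if alternative is None:
        return False  # no usable connected node available
    block.connected = alternative
    launch(block)
    return True
```

In this sketch the caller decides which nodes count as candidates, since the abstract does not say where the alternative connected node is drawn from.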
Budnik; Thomas A. (Rochester, MN), Knudson; Brant L. (Rochester, MN), Megerian; Mark G. (Rochester, MN), Miller; Samuel J. (Rochester, MN), Stockdell; William M. (Byron, MN)
International Business Machines Corporation (Armonk, NY)
12/861,426
August 23, 2010
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR DEVELOPMENT

This invention was made with Government support under Contract No. B554331 awarded by the Department of Energy. The Government has certain rights in this invention.