Methods, apparatus, and a computer software product for source code optimization are provided. In an exemplary embodiment, a first custom computing apparatus is used to optimize the execution of source code on a second computing apparatus. In this embodiment, the first custom computing apparatus contains a memory, a storage medium, and at least one processor with at least one multi-stage execution unit. The second computing apparatus contains at least two multi-stage execution units that allow for parallel execution of tasks. The first custom computing apparatus optimizes the code for parallelism, locality of operations, and contiguity of memory accesses on the second computing apparatus. This Abstract is provided for the sole purpose of complying with the Abstract requirement rules. This Abstract is submitted with the explicit understanding that it will not be used to interpret or to limit the scope or the meaning of the claims.
This invention was made with Government support under contract no. DE-FG02-08ER85149 awarded by the Department of Energy, contract nos. F30602-03-C-0033 and W31P4Q-07-C-0147 awarded by the Defense Advanced Research Projects Agency, contract nos. W9113M-07-C-0072 and W9113M-08-C-0146 awarded by the Missile Defense Agency, and contract no. FA8650-07-M-8129 awarded by the Office of the Secretary of Defense. The Government has certain rights in the invention.