Disclosed is a method and system for performing operations on at least one input data vector in order to produce at least one output vector to permit easy, scalable and fast programming of a petascale equivalent supercomputer. A PetaFlops Router may comprise one or more PetaFlops Nodes, which may be connected to each other and/or external data provider/consumers via a programmable crossbar switch external to the PetaFlops Node. Each PetaFlops Node has a FPGA and a programmable intra-FPGA crossbar switch that permits input and output variables to be configurably connected to various physical operators contained in the FPGA as desired by a user. This allows a user to specify the instruction set of the system on a per-application basis. Further, the intra-FPGA crossbar switch permits the output of one operation to be delivered as an input to a second operation. By configuring the external crossbar switch, the output of a first operation on a first PetaFlops Node may be used as the input for a second operation on a second PetaFlops Node. An embodiment may provide an ability for the system to recognize and generate pipelined functions. Streaming operators may be connected together at run-time and appropriately staged to allow data to flow through a series of functions. This allows the system to provide high throughput and parallelism when possible. The PetaFlops Router may implement the user desired instructions by appropriately configuring the intra-FPGA crossbar switch on each PetaFlops Node and the external crossbar switch.
STATEMENT REGARDING FEDERAL RIGHTS
This invention was made with government support under Contract No. DE-AC52-06NA25396 awarded by the U.S. Department of Energy. The government has certain rights in the invention.