摘要:
In some embodiments, a method and apparatus for automatically parallelizing a sequential network application through pipeline transformation are described. In one embodiment, the method includes the configuration of a network processor into a D-stage processor pipeline. Once configured, a sequential network application program is transformed into D-pipeline stages. Once transformed, the D-pipeline stages are executed in parallel within the D-stage processor pipeline. In one embodiment, transformation of a sequential application program is performed by modeling the sequential network program as a flow network model and selecting from the flow network model into a plurality of preliminary pipeline stages. Other embodiments are described and claimed.
摘要:
A program may be partitioned into at least two stages, where at least one of the stages comprises more than one parallel thread. Data required by each of the stages, which data is defined in a previous stage may be identified. Transmission of the required data between consecutive stages may then be provided for.
摘要:
A program may be partitioned into at least two stages, where at least one of the stages comprises more than one parallel thread. Data required by each of the stages, which data is defined in a previous stage may be identified. Transmission of the required data between consecutive stages may then be provided for.
摘要:
A method is presented including assigning a first register class to at least one symbolic register in at least one instruction, determining and assigning a second register class to the at least one register, reducing register class fixups and renaming the at least one symbolic register. Also presented is a system including a processor having at least one register and a compiler executing in the processor that inputs a source program having many operation blocks. The compiler assigns a first register class in at least one instruction to at least one symbolic register, determines and assigns a second register class to the at least one symbolic register, reduces register class fixups, and renames the at least one symbolic register.
摘要:
A method is presented including assigning a first register class to at least one symbolic register in at least one instruction, determining and assigning a second register class to the at least one register, reducing register class fixups and renaming the at least one symbolic register. Also presented is a system including a processor having at least one register and a compiler executing in the processor that inputs a source program having many operation blocks. The compiler assigns a first register class in at least one instruction to at least one symbolic register, determines and assigns a second register class to the at least one symbolic register, reduces register class fixups, and renames the at least one symbolic register.
摘要:
In some embodiments, a method and apparatus for automatically parallelizing a sequential network application through pipeline transformation are described. In one embodiment, the method includes the configuration of a network processor into a D-stage processor pipeline. Once configured, a sequential network application program is transformed into D-pipeline stages. Once transformed, the D-pipeline stages are executed in parallel within the D-stage processor pipeline. In one embodiment, transformation of a sequential application program is performed by modeling the sequential network program as a flow network model and selecting from the flow network model into a plurality of preliminary pipeline stages. Other embodiments are described and claimed.
摘要:
In some embodiments, a method and apparatus for automatically parallelizing a sequential network application through pipeline transformation are described. In one embodiment, the method includes the configuration of a network processor into a D-stage processor pipeline. Once configured, a sequential network application program is transformed into D-pipeline stages. Once transformed, the D-pipeline stages are executed in parallel within the D-stage processor pipeline. In one embodiment, transformation of a sequential application program is performed by modeling the sequential network program as a flow network model and selecting from the flow network model into a plurality of preliminary pipeline stages. Other embodiments are described and claimed.
摘要:
In some embodiments, a method and apparatus for automatically parallelizing a sequential network application through pipeline transformation are described. In one embodiment, the method includes the configuration of a network processor into a D-stage processor pipeline. Once configured, a sequential network application program is transformed into D-pipeline stages. Once transformed, the D-pipeline stages are executed in parallel within the D-stage processor pipeline. In one embodiment, transformation of a sequential application program is performed by modeling the sequential network program as a flow network model and selecting from the flow network model into a plurality of preliminary pipeline stages. Other embodiments are described and claimed.
摘要:
In some embodiments, a method and apparatus for an automatic thread-partition compiler are described. In one embodiment, the method includes the transformation of a sequential application program into a plurality of application program threads. Once partitioned, the plurality of application program threads are concurrently executed as respective threads of a multi-threaded architecture. Hence, a performance improvement of the parallel multi-threaded architecture is achieved by hiding memory access latency through or by overlapping memory access with computations or with other memory accesses. Other embodiments are described and claimed.
摘要:
A method of scheduling a sequence of instructions is described. A target program is read, a pipeline control hazard is identified within the sequence of instructions, and a selected sequence of instructions is re-ordered. Two steps for re-ordering are applied to the selected sequence of instructions. First, a backward scheduling method is performed, and second, a forward scheduling method is performed.