摘要:
A computer architecture and programming model for high speed processing over broadband networks are provided. The architecture employs a consistent modular structure, a common computing module and uniform software cells. The common computing module includes a control processor, a plurality of processing units, a plurality of local memories from which the processing units process programs, a direct memory access controller and a shared main memory. A synchronized system and method for the coordinated reading and writing of data to and from the shared main memory by the processing units also are provided. A hardware sandbox structure is provided for security against the corruption of data among the programs being processed by the processing units. The uniform software cells contain both data and applications and are structured for processing by any of the processors of the network. Each software cell is uniquely identified on the network.
摘要:
Mechanisms for modeling system level effects of soft errors are provided. Mechanisms are provided for integrating device-level and component-level soft error rate (SER) analysis mechanisms with micro-architecture level performance analysis tools during a concept phase of the IC design to thereby generate a SER analysis tool. A first SER profile for the IC design is generated by applying the SER analysis tool to the IC design. At a later phase of the IC design, detailed information about SER vulnerabilities of logic and storage elements within the IC design are obtained and the first SER profile is refined based on the detailed information about SER vulnerabilities to thereby generate a second SER profile for the IC design. Modifications to the IC design are made at one or more phases of the IC design based on one of the first SER profile or the second SER profile.
摘要:
Mechanisms for modeling system level effects of soft errors are provided. Mechanisms are provided for integrating device-level and component-level soft error rate (SER) analysis mechanisms with micro-architecture level performance analysis tools during a concept phase of the IC design to thereby generate a SER analysis tool. A first SER profile for the IC design is generated by applying the SER analysis tool to the IC design. At a later phase of the IC design, detailed information about SER vulnerabilities of logic and storage elements within the IC design are obtained and the first SER profile is refined based on the detailed information about SER vulnerabilities to thereby generate a second SER profile for the IC design. Modifications to the IC design are made at one or more phases of the IC design based on one of the first SER profile or the second SER profile.
摘要:
A method and apparatus are provided for moving at least one of instructions and operand data throughout a plurality of caches included in a multiprocessor computer system, wherein each of the plurality of caches is included in one of a plurality of processing nodes of the system so as to provide history-based movement of shared-data in coherent cache memories. A plurality of entries are stored in a consume after produce (CAP) table attached to each of the plurality of caches. Each of the entries is associated with a plurality of storage elements in one of the plurality of caches and includes information of prior usage of the plurality of storage elements by each of the plurality of processing nodes. Upon a miss by a processing node to a cache included therein, any storage elements that caused the miss are transferred to the cache from one of main memory and another cache. An entry is created in the table that is associated with the storage elements that caused the miss. A push prefetching engine may be used to create the entry.
摘要:
A system for instruction memory storage and processing in a computing device having a processor, the system is based on backwards branch control information and comprises a dynamic loop buffer (DLB) which is a tagless array of data organized as a direct-mapped structure; a DLB controller having a primary memory unit partitioned into a plurality of banks for controlling the state of the instruction memory system and accepting a program counter address as an input, the DLB controller outputs distinct signals. The system further comprises an address register located in the memory of the computing device, it is a staging register for the program counter address and an instruction fetch process that takes two cycles of the processor clock; and a bank select unit for serving as a program counter address decoder to accept the program counter address and to output a bank enable signal for selecting a bank in a primary memory unit, and a decoded address for access within the selected bank.
摘要:
A pipeline system and method includes a plurality of operational stages. The stages include a pointer register stage which stores pointer information and updates, and a rename and dependence checking stage located downstream of the pointer register stage, which renames registers and determines if dependencies exist. A functional unit provides pointer information updates to the pointer register stage such that pointer information is processed and updated to the pointer register stage before or in parallel with the register dependency checking.
摘要:
A pipeline system and method includes a plurality of operational stages. The stages include a pointer register stage which stores pointer information and updates, and a rename and dependence checking stage located downstream of the pointer register stage, which renames registers and determines if dependencies exist. A functional unit provides pointer information updates to the pointer register stage such that pointer information is processed and updated to the pointer register stage before or in parallel with the register dependency checking.
摘要:
There is provided a method for fetching at least one of instructions and operand data from a second memory into a first memory of a computer system having at least one processor. The method includes the step of storing a plurality of entries in a table associated with the first memory. Each entry is associated with a memory page that includes a plurality of storage elements in the second memory, and includes information of prior access by the at least one processor to each of the plurality of storage elements. Upon a miss to the first memory from the at least one processor based upon a request, the table is searched for a given entry associated with a given page that includes a target of the request. If the given entry is found, then at least one prefetch request is generated to fetch at least one storage element included in the given page from the second memory to the first memory, based upon given information comprised in the given entry.
摘要:
A method and system for attached processing units accessing a shared memory in an SMT system. In one embodiment, a system comprises a shared memory. The system further comprises a plurality of processing elements coupled to the shared memory. Each of the plurality of processing elements comprises a processing unit, a direct memory access controller and a plurality of attached processing units. Each direct memory access controller comprises an address translation mechanism thereby enabling each associated attached processing unit to access the shared memory in a restricted manner without an address translation mechanism. Each attached processing unit is configured to issue a request to an associated direct memory access controller to access the shared memory specifying a range of addresses to be accessed as virtual addresses. The associated direct memory access controller is configured to translate the range of virtual addresses into an associated range of physical addresses.
摘要:
A method and system for attached processing units accessing a shared memory in an SMP system. In one embodiment, a system comprises a shared memory. The system further comprises a plurality of processing elements coupled to the shared memory. Each of the plurality of processing elements comprises a processing unit, a direct memory access controller and a plurality of attached processing units. Each direct memory access controller comprises an address translation mechanism thereby enabling each associated attached processing unit to access the shared memory in a restricted manner without an address translation mechanism. Each attached processing unit is configured to issue a request to an associated direct memory access controller to access the shared memory specifying a range of addresses to be accessed as virtual addresses. The associated direct memory access controller is configured to translate the range of virtual addresses into an associated range of physical addresses.