摘要:
Systems and methods for distributing thread instructions in the pipeline of a multi-threading digital processor are disclosed. More particularly, hardware and software are disclosed for successively selecting threads in an ordered sequence for execution in the processor pipeline. If a thread to be selected cannot execute, then a complementary thread is selected for execution.
摘要:
A program is into at least two object files: one object file for each of the supported processor environments. During compilation, code characteristics, such as data locality, computational intensity, and data parallelism, are analyzed and recorded in the object file. During run time, the code characteristics are combined with runtime considerations, such as the current load on the processors and the size of the data being processed, to arrive at an overall value. The overall value is then used to determine which of the processors will be assigned the task. The values are assigned based on the characteristics of the various processors. For example, if one processor is better at handling intensive computations against large streams of data, programs that are highly computationally intensive and process large quantities of data are weighted in favor of that processor. The corresponding object is then loaded and executed on the assigned processor.
摘要:
The present invention provides for authenticating a message. A security function is performed upon the message. The message is sent to a target. The output of the security function is sent to the target. At least one publicly known constant is sent to the target. The received message is authenticated as a function of at least a shared key, the received publicly known constants, the security function, the received message, and the output of the security function. If the output of the security function received by the target is the same as the output generated as a function of at least the received message, the received publicly known constants, the security function, and the shared key, neither the message nor the constants have been altered.
摘要:
A method and system for encrypting and verifying the integrity of a message using a three-phase encryption process is provided. A source having a secret master key that is shared with a target receives the message and generates a random number. The source then generates: a first set of intermediate values from the message and the random number; a second set of intermediate values from the first set of values; and a cipher text from the second set of values. At the three phases, the values are generated using the encryption function of a block cipher encryption/decryption algorithm. The random number and the cipher text are transmitted to the target, which decrypts the cipher text by reversing the encryption process. The target verifies the integrity of the message by comparing the received random number with the random number extracted from the decrypted cipher text.
摘要:
A method and system for executing one or more remote procedure calls. In one embodiment, a method comprises the step of a processing unit issuing a plurality of commands to a corresponding DMA controller. One or more commands of the plurality of commands issued by the processing unit are to copy attached processing unit instructions associated with one or more Attached Processing Unit's (APU's) and data associated with the attached processing unit instructions from the shared memory to one or more APU's. The attached processing unit instructions may include instructions that enable the associated one or more APU's to perform one or more particular operations on the data. The method further comprises the DMA controller issuing an indication to the one or more APU's to perform the one or more operations on the data associated with the attached processing unit instructions. Instead of having the particular APU that completed its operation notify the corresponding processing unit of its completion of the operation, the DMA controller polls a status line of each of the one or more attached processing units to determine if any of the one or more attached processing units completed its operation. The DMA controller then copies the results of the operations after each of the one or more attached processing units completes its operation.
摘要:
An improved cell circuit for data readout with reduced number of read wordlines is provided in a memory block of a multiport memory array. The number of read wordlines is significantly reduced by using a decoder between the read wordlines and a multiplexer in the cell circuit. The memory block has a plurality of address inputs and stores a plurality of write data signals. In the cell circuit, the decoder receives as decoder inputs a subset of the address inputs and outputs a plurality of select signals. The multiplexer is coupled to the decoder to receive the select signals and select one of the write data signals based on the select signals. Additionally, the read wordlines are coupled to the decoder for carrying the subset of the address inputs to the decoder.
摘要:
According to one embodiment, a multiprocessing system includes a first processor, a second processor, and compare logic. The first processor is operable to compute first results responsive to instructions, the second processor is operable to compute second results responsive to the instructions, and the compare logic is operable to check at checkpoints for matching of the results. Each of the processors has a first register for storing one of the processor's results, and the register has a stack of shadow registers. The processor is operable to shift a current one of the processor's results from the first register into the top shadow register, so that an earlier one of the processor's results can be restored from one of the shadow registers to the first register responsive to the compare logic determining that the first and second results mismatch. It is advantageous that the shadow register stack is closely coupled to its corresponding register, which provides for fast restoration of results. In a further aspect of an embodiment, each processor has a signature generator and a signature storage unit. The signature generator and storage units are operable to cooperatively compute a cumulative signature for a sequence of the processor's results, and the processor is operable to store the cumulative signature in the signature storage unit pending the match or mismatch determination by the compare logic. The checking for matching of the results includes the compare logic comparing the cumulative signatures of each respective processor. It is faster, and therefore advantageous, to check respective cumulative signatures at intervals rather than to check each individual result.
摘要:
A condition code register architecture for supporting multiple execution units is disclosed. A master execution unit is coupled a master condition code register such that condition codes generated by the master execution unit are stored in the master condition code register. A non-master execution unit is coupled to a shadow condition code register such that condition codes generated by the non-master execution unit are stored in the shadow condition code register. A tag unit coupled to the master execution unit and the non-master execution unit such that an entry within the master condition code register can be read only when a corresponding entry within the tag unit is referenced to the master execution unit or the master condition code register.
摘要:
A multi-chip module is disclosed in which a first die connects to a second set of die via a set of C4 connections within a single package. Low resistivity signal posts are provided within the lateral separation between adjacent die in the second set of die. These signal posts are connectable to externally supplied power signals. The power signals provided to the signals posts are routed to circuits within the second set of die over relatively short metallization interconnects. The signal posts may be connected to thermally conductive via elements and the package may include heat spreaders on upper and lower package surfaces. The first die may comprise a DRAM while the second set of die comprise portions of a general purpose microprocessor. The power signals provided to the second set of die may be connected to a capacitor terminal in the first die to provide power signal decoupling.
摘要:
A method for performing address mapping for a memory within a computer system is disclosed. The memory is organized in multiple of memory banks, and each memory bank is identified by a respective bank number. A block address portion of a physical address is translated to a corresponding bank number and an associated internal bank address. The bank number is formed by concatenating an output from a first lookup table and an output from a second lookup table. The output from the first lookup table is obtained by a first and a second segments of the block address portion, while the output from the second lookup table is obtained by a third and a fourth segments of the block address portion. Data stored in a specific location within the memory banks can be accessed by the bank number and the associated internal bank address.