摘要:
A multi-mode multi-parallelism data exchange method and the device thereof are proposed to apply to a check node operator or a bit node operator. The proposed method comprises the steps of: duplicating part or all of an original shift data as a duplicated shift data; combining the original shift data and the duplicated shift data to form a data block; and using a data block as the unit to shift this data block so as to conveniently retrieve shift data from the shifted data block. With a maximum z factor circuit and duplication of part of data, specifications of different shift sizes can be supported. The functions of shifters of several sizes can therefore be accomplished with the minimum complexity.
摘要:
A multi-mode multi-parallelism data exchange method and the device thereof are proposed to apply to a check node operator or a bit node operator. The proposed method comprises the steps of: duplicating part or all of an original shift data as a duplicated shift data; combining the original shift data and the duplicated shift data to form a data block; and using a data block as the unit to shift this data block so as to conveniently retrieve shift data from the shifted data block. With a maximum z factor circuit and duplication of part of data, specifications of different shift sizes can be supported. The functions of shifters of several sizes can therefore be accomplished with the minimum complexity.
摘要:
An operating method applied to low density parity check (LDPC) decoders and the circuit thereof are proposed, in which original bit nodes are incorporated into check nodes for simultaneous operation. The bit node messages are generated according to the different between the newly generated check messages and the previously check node messages. The bit node messages can be updated immediately, and the decoder throughput can be improved. In the other way, the required memory of LDPC decoders can be effectively reduced, and the decoding speed can also be enhanced.
摘要:
An operating method and a circuit for low density parity check (LDPC) decoders, in which original bit nodes are incorporated into check nodes for simultaneous operation. The bit node messages are generated according to the difference between the newly generated check messages and the previous check node messages. The bit node messages can be updated immediately, and the decoder throughput can be improved. The required memory of LDPC decoders can be effectively reduced, and the decoding speed can also be enhanced.
摘要:
An optimization scheme for a directory-based cache coherence protocol for multistage interconnection network-based multiprocessors improves system performance by reducing network latency. The optimization scheme is scalable, targeting multiprocessor systems having a moderate number of processors. The modification of shared data is the dominant contributor to performance degradation in these systems. The directory-based cache coherence scheme uses an invalidation bus on the processor side of the network. The invalidation bus connects all the private caches in the system and processes the invalidation requests thereby eliminating the need to send invalidations across the network. In operation, a processor which attempts to modify data places an address of the data to be modified on the invalidation bus simultaneously with sending a store request for the data modification to the global directory and the global directory sends to the processor attempting to modify the data, in addition to the permission signal, a count of the number of invalidation acknowledgments the processor should receive.
摘要:
A combining switch 10 includes a two input multiplexer 12 which receives I and J inputs from data processors and directs one of the incoming messages, if there are no contentions of congestions at a switch output port 14 and a Quene FIFO 16 is empty, directly to the output port 14 for transmission to one of a plurality of memory modules. If the output port 14 is busy and the Queue 16 is empty the incoming message is routed to the Queue FIFO 16 for storage. If the Queue FIFO 16 is not empty the incoming message is first compared by a comparator 20 to all existing messages stored in the Queue FIFO 16 to determine if the incoming messasge is destined for a memory address which already has a queued message. If no match is determined by comparator 20 the incoming message is routed to the Queue FIFO 16 for storage. If comparator 20 determines that the memory address and operation type of the incoming message matches that of a message already stored in the Queue FIFO 16 both the incoming message and the queued message are applied to a message combining ALU 26. The ALU 26 generates a combined message which is stored at the same Queue 16 location as the queued message which generated a comparison match with the incoming message.
摘要:
A method of reducing false sharing in a shared memory system by enabling two caches to modify the same line at the same time. More specifically, with this invention a lock associated with a segment of shared memory is acquired, where the segment will then be used exclusively by processor of the shared memory system that has acquired the lock. For each line of the segment, an invalidation request is sent to a number of caches of the system. When a cache receives the invalidation request, it invalidates each line of the segment that is in the cache. When each line of the segment is invalidated, an invalidation acknowledgement is sent to the global directory. For each line of the segment that has been updated or modified, the update data is written back to main memory. Then, an acquire signal is sent to the requesting processor which then has exclusive use of the segment.
摘要:
A system and method for a fault-tolerant system for parallel networks which interconnect processors and the first of the parallel networks distributed in an Omega configuration and the second of the parallel networks distributed in a reversed Omega configuration.
摘要:
A method for assuring virtual atomic invalidation in a multilevel cache system wherein lower level cache locations store portions of a line stored at a higher level cache location. Upon receipt of an invalidation signal, the higher level cache location invalidates the line and places a HOLD bit on the invalidated line. Thereafter, the higher level cache sends invalidation signals to all lower level caches which store portions of the invalidated line. Each lower level cache invalidates its portion of the line and sets a HOLD bit on its portion of the line. The HOLD bits are reset after all line portion invalidations have been completed.
摘要:
A method of maintaining cache coherency in a shared memory multiprocessor system having a plurality of nodes, where each node itself is a shared memory multiprocessor. With this invention, an additional shared owner state is maintained so that if a cache at the highest level of cache memory in the system issues a read or write request to a cache line that misses the highest cache level of the system, then the owner of the cache line places the cache line on the bus interconnecting the highest level of cache memories.