Abstract:
According to the invention, a method for processing data related to an array of elements is disclosed. In one embodiment, a first value is loaded from a first location, and a second value is loaded from a second location. The first and second values are compared to each other. A predetermined value is optionally stored at a destination based upon the outcome of the comparison.
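As a reading of this claim, the following C sketch models the compare-and-conditionally-store operation on a per-element basis; the equality test, the function name, and the loop over an array are illustrative assumptions rather than the claimed hardware.

    #include <stddef.h>

    /* Sketch of the claimed operation for each array element: load two
     * values, compare them, and conditionally store a predetermined
     * value at the destination depending on the comparison outcome. */
    void compare_and_store(const int *first, const int *second,
                           int *dest, size_t n, int predetermined)
    {
        for (size_t i = 0; i < n; i++) {
            int a = first[i];   /* first value, from the first location   */
            int b = second[i];  /* second value, from the second location */
            if (a == b) {       /* comparison; equality is assumed here   */
                dest[i] = predetermined;  /* optional store on match      */
            }
        }
    }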
Abstract:
A novel processor chip (10) is disclosed having a processing core (12), at least one bank of memory (14), an I/O link (26) configured to communicate with other like processor chips or compatible I/O devices, a memory controller (20) in electrical communication with processing core (12) and memory (14), and a distributed shared memory controller (22) in electrical communication with memory controller (20) and I/O link (26). Distributed shared memory controller (22) is configured to control the exchange of data between processor chip (10) and the other processor chips or I/O devices. In addition, memory controller (20) is configured to receive memory requests from processing core (12) and distributed shared memory controller (22) and process the memory requests with memory (14). Processor chip (10) may further comprise an external memory interface (24) in electrical communication with memory controller (20). External memory interface (24) is configured to connect processor chip (10) with external memory, such as DRAM. In that case, memory controller (20) is configured to receive memory requests from processing core (12) and distributed shared memory controller (22), determine whether the memory requests are directed to memory (14) on chip (10) or to the external memory, and process the memory requests either with memory (14) on processor chip (10) or with the external memory through external memory interface (24).
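The routing decision made by memory controller (20) can be pictured with the small C sketch below; the address window, the function names, and the printed output are assumptions made only for illustration.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    /* Assumed address window for on-chip memory (14); a real chip would
     * take this from its memory map rather than a fixed constant. */
    #define ON_CHIP_MEM_BASE  0x00000000u
    #define ON_CHIP_MEM_SIZE  0x00400000u

    /* Hypothetical stand-ins for servicing a request with on-chip
     * memory (14) or through the external memory interface (24). */
    static void on_chip_access(uint32_t addr)  { printf("on-chip  %08x\n", (unsigned)addr); }
    static void external_access(uint32_t addr) { printf("external %08x\n", (unsigned)addr); }

    /* Memory controller (20): receives a request from the processing
     * core (12) or the distributed shared memory controller (22),
     * determines where it is directed, and processes it accordingly. */
    static void memory_controller_dispatch(uint32_t addr)
    {
        bool on_chip = (addr - ON_CHIP_MEM_BASE) < ON_CHIP_MEM_SIZE;
        if (on_chip)
            on_chip_access(addr);    /* processed with memory (14)              */
        else
            external_access(addr);   /* processed via interface (24), e.g. DRAM */
    }

    int main(void)
    {
        memory_controller_dispatch(0x00001000u);  /* inside the on-chip window */
        memory_controller_dispatch(0x80000000u);  /* routed to external memory */
        return 0;
    }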
Abstract:
According to the invention, a process for averaging two pixel values is disclosed. In one step, an instruction is decoded. A plurality of first operands is loaded from a first input register. A plurality of second operands is loaded from a second input register. An average of one of the plurality of first operands and one of the plurality of second operands is produced. The average is stored in an output register.
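A scalar C model of such a packed pixel-average instruction is sketched below, assuming four 8-bit operands per 32-bit register and truncating division; real instructions may pack and round differently.

    #include <stdint.h>

    /* Sketch of a packed pixel-average operation: each 32-bit "register"
     * holds four 8-bit pixel operands; each output lane is the average of
     * the corresponding lanes of the two inputs.  Rounding toward zero is
     * assumed here. */
    uint32_t packed_average(uint32_t src1, uint32_t src2)
    {
        uint32_t result = 0;
        for (int lane = 0; lane < 4; lane++) {
            uint32_t a = (src1 >> (8 * lane)) & 0xFF;  /* first operand    */
            uint32_t b = (src2 >> (8 * lane)) & 0xFF;  /* second operand   */
            uint32_t avg = (a + b) / 2;                /* lane average     */
            result |= avg << (8 * lane);               /* pack into output */
        }
        return result;
    }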
Abstract:
According to the invention, a processing core is disclosed that includes a first source register, a number of second operands, a destination register, and a number of arithmetic processors. A bitwise inverter is coupled to at least one of the first operands and the second operands. The first source register includes a plurality of first operands, and the destination register includes a plurality of results. The arithmetic processors are respectively coupled to the first operands, the second operands, and the results, wherein each arithmetic processor computes one of a sum and a difference of a respective first operand and a respective second operand.
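One way to read the claim is the C sketch below, which models each arithmetic processor as a lane that either adds or, via the bitwise inverter and a carry-in of one, subtracts; the lane count, the 32-bit operand width, and the choice to invert the second operand rather than the first are all assumptions.

    #include <stdint.h>
    #include <stdbool.h>

    #define LANES 4   /* assumed number of arithmetic processors (lanes) */

    /* Each lane computes either a sum or a difference of a first operand
     * and a respective second operand.  Subtraction is realized the way a
     * bitwise inverter allows in hardware: a - b == a + ~b + 1. */
    void lane_add_sub(const uint32_t first[LANES], const uint32_t second[LANES],
                      uint32_t result[LANES], bool subtract)
    {
        for (int i = 0; i < LANES; i++) {
            uint32_t b = subtract ? ~second[i] : second[i];  /* bitwise inverter  */
            uint32_t carry_in = subtract ? 1u : 0u;
            result[i] = first[i] + b + carry_in;             /* sum or difference */
        }
    }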
Abstract:
According to the invention, a processing core is disclosed. The processing core includes one or more processing pipelines and a number of register files. The processing pipelines have a total of N-number of processing paths, where each of the processing paths processes instructions on M-bit data words. Each of the register files has Q-number of registers that are each M bits wide. The Q-number of registers within each of the register files are either private or global registers. When a value is written to one of the Q-number of registers that is a global register within one of the register files, the value is propagated to the corresponding global register in each of the other register files. When a value is written to one of the Q-number of registers that is a private register within one of the register files, the value is not propagated to the corresponding register in the other register files.
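The private/global distinction can be modeled roughly as below; the number of register files, the register count Q, and the choice of which registers are global are illustrative assumptions.

    #include <stdint.h>
    #include <stdbool.h>

    #define NUM_FILES 4   /* assumed number of register files (one per path) */
    #define Q         8   /* assumed Q-number of M-bit registers per file    */

    static uint32_t reg_file[NUM_FILES][Q];  /* M = 32 bits assumed           */
    static const bool is_global[Q] = {       /* which registers are global    */
        true, true, false, false, false, false, false, false
    };

    /* Writing a global register propagates the value to the corresponding
     * register in every other register file; writing a private register
     * updates only the local file. */
    void write_register(int file, int reg, uint32_t value)
    {
        reg_file[file][reg] = value;
        if (is_global[reg]) {
            for (int f = 0; f < NUM_FILES; f++)
                reg_file[f][reg] = value;    /* propagate to all files        */
        }
    }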
Abstract:
A microprocessor with reduced context switching overhead and a corresponding method are disclosed. The microprocessor comprises a working register file that comprises dirty bit registers and working registers. The working registers include one or more corresponding working registers for each of the dirty bit registers. The microprocessor also comprises a decoder unit that is configured to decode an instruction that has a dirty bit register field specifying a selected dirty bit register of the dirty bit registers. The decoder unit is configured to generate decode signals in response. Furthermore, the working register file is configured to cause the selected dirty bit register to store a new dirty bit in response to the decode signals. The new dirty bit indicates that each operand stored by the one or more corresponding working registers is inactive and no longer needs to be saved to memory if a context switch occurs.
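The saving that the dirty bits enable can be modeled roughly as follows; the one-working-register-per-dirty-bit mapping and the save_to_memory stub are assumptions made for illustration, not the patented microarchitecture.

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define NUM_DIRTY 8   /* assumed number of dirty bit registers */

    static uint32_t working_reg[NUM_DIRTY];  /* one working register per dirty bit (assumed mapping) */
    static bool     dirty[NUM_DIRTY];        /* set => operand is live and must be saved */

    /* Hypothetical stand-in for spilling a register to memory. */
    void save_to_memory(int r) { printf("saving r%d = %u\n", r, (unsigned)working_reg[r]); }

    /* The decoded instruction selects a dirty bit register and stores a
     * new dirty bit marking its working register(s) inactive. */
    void mark_inactive(int selected) { dirty[selected] = false; }

    /* On a context switch, only registers whose dirty bit still marks them
     * active need to be written out, which reduces switching overhead. */
    void context_switch_save(void)
    {
        for (int r = 0; r < NUM_DIRTY; r++)
            if (dirty[r])
                save_to_memory(r);
    }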
Abstract:
An integrated processor/memory device comprising a main memory, a CPU, and a full-width cache is disclosed. The main memory comprises main memory banks. Each of the main memory banks stores rows of words, where the rows are a predetermined number of words wide. The cache comprises cache banks. Each of the cache banks stores one or more cache lines of words, and each of the cache lines has a corresponding row in the corresponding main memory bank. The cache lines are the same predetermined number of words wide. When the CPU issues an address in the address space of the corresponding main memory bank, the cache bank determines from the address and the tags of the cache lines whether a cache bank hit or a cache bank miss has occurred in the cache bank. When a cache bank miss occurs, the cache bank replaces a victim cache line of the cache lines with a new cache line that comprises the corresponding row of the corresponding main memory bank specified by the issued address.
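A simplified, single-bank C model of the hit/miss determination and victim replacement is sketched below; the row width, bank geometry, direct-mapped placement, and read-only handling are assumptions the abstract does not fix.

    #include <stdint.h>
    #include <stdbool.h>
    #include <string.h>

    #define WORDS_PER_ROW  8      /* assumed: rows and cache lines are 8 words wide  */
    #define LINES_PER_BANK 4      /* assumed number of cache lines in one cache bank */
    #define ROWS_PER_BANK  1024   /* assumed rows in the corresponding memory bank   */

    typedef struct {
        bool     valid;
        uint32_t tag;                      /* row number held by this line           */
        uint32_t words[WORDS_PER_ROW];     /* full-width line = one main-memory row   */
    } cache_line_t;

    static uint32_t     main_bank[ROWS_PER_BANK][WORDS_PER_ROW];
    static cache_line_t cache_bank[LINES_PER_BANK];

    /* On an access, decide hit or miss from the address and the line tags.
     * On a miss, a victim line is replaced with the full row named by the
     * address (direct-mapped placement is assumed for simplicity). */
    uint32_t cache_bank_read(uint32_t row, uint32_t word)
    {
        cache_line_t *line = &cache_bank[row % LINES_PER_BANK];
        bool hit = line->valid && line->tag == row;
        if (!hit) {
            /* victim replacement: bring in the whole row in one transfer   */
            memcpy(line->words, main_bank[row], sizeof(line->words));
            line->tag = row;
            line->valid = true;
        }
        return line->words[word];
    }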
Abstract:
In one embodiment, an apparatus includes a network management module configured to execute at a network device operatively coupled to a switch fabric. The network management module is configured to receive a first set of configuration information associated with a subset of network resources from a set of network resources, the set of network resources being included in a virtual local area network from a plurality of virtual local area networks, the plurality of virtual local area networks being defined within the switch fabric. The first set of configuration information dynamically includes at least a second set of configuration information associated with the set of network resources.
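Purely as an illustration of the relationships described, the C sketch below models a VLAN's resource set and a subset configuration that references the set-wide configuration; every name and field here is an assumption, not the claimed apparatus.

    #include <stddef.h>

    /* Illustrative-only data model: a VLAN defined in the switch fabric
     * holds a set of network resources; configuration information for a
     * subset of those resources references (and so dynamically includes)
     * the configuration of the whole set. */
    struct config_info {
        const char *settings;               /* opaque configuration data    */
        const struct config_info *includes; /* dynamically included config  */
    };

    struct network_resource {
        int id;
        struct config_info config;
    };

    struct vlan {
        struct network_resource *resources; /* the set of network resources */
        size_t count;
        struct config_info set_config;      /* second set of configuration  */
    };

    /* The network management module receives configuration for a subset of
     * resources; linking it to the VLAN-wide configuration models the
     * "dynamically includes" relationship described above. */
    void receive_subset_config(struct network_resource *subset, size_t n,
                               const struct vlan *v, const char *settings)
    {
        for (size_t i = 0; i < n; i++) {
            subset[i].config.settings = settings;
            subset[i].config.includes = &v->set_config;
        }
    }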
Abstract:
A method and apparatus for delivering a device driver to an operating system without user intervention. One or more operating systems (e.g., different operating system programs, different versions of one operating system) execute on a computer platform. During booting of an operating system, a device is identified for which a driver is needed. The driver is requested from a service processor of the platform, which includes memory or storage for storing multiple device drivers (or multiple versions of one driver, for different operating systems). The driver is retrieved from the service processor's storage and delivered to the operating system.
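In outline, the boot-time exchange could be modeled as below; the driver-store table, the lookup keys, and the function name are hypothetical and stand in for whatever interface the service processor actually exposes.

    #include <string.h>

    /* Hypothetical driver store kept by the service processor: one entry
     * per (device, operating system) pair, held in its memory or storage. */
    struct driver_entry { const char *device; const char *os; const char *image; };

    static const struct driver_entry driver_store[] = {
        { "nic0", "os-a v1", "nic0_osa_v1.drv" },
        { "nic0", "os-b v2", "nic0_osb_v2.drv" },
    };

    /* During boot, the OS identifies a device with no driver and asks the
     * service processor for one; the matching image is returned to the OS. */
    const char *request_driver(const char *device, const char *os)
    {
        for (size_t i = 0; i < sizeof(driver_store) / sizeof(driver_store[0]); i++)
            if (!strcmp(driver_store[i].device, device) &&
                !strcmp(driver_store[i].os, os))
                return driver_store[i].image;   /* delivered to the OS      */
        return NULL;                            /* no matching driver found */
    }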
Abstract:
A plurality of processors on a chip is operated in lockstep. A crossbar switch on the chip couples and decouples the plurality of processors to and from a plurality of banks in a level-two (L2) cache. As data is stored in a first bank of the L2 cache, the old data at that location is passed through the crossbar switch to a second bank of the L2 cache that functions as a first-in, first-out memory (FIFO). Thus, new data is cached (stored) at a location in the first bank of the L2 cache, and the old data from that location is logged in the second bank of the L2 cache. The logged data in the second bank is used to restore the first bank to a known prior state when necessary.
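The store-and-log behavior can be pictured with the C model below; the bank sizes, the FIFO bookkeeping, and the reverse-replay restore are assumptions made for illustration.

    #include <stdint.h>

    #define BANK_WORDS 1024    /* assumed size of an L2 bank, in words      */
    #define LOG_DEPTH  1024    /* assumed FIFO capacity of the logging bank */

    static uint32_t bank0[BANK_WORDS];                          /* first L2 bank: live data   */
    static struct { uint32_t addr, old; } log_fifo[LOG_DEPTH];  /* second L2 bank used as FIFO */
    static int log_head;

    /* Storing new data into the first bank first pushes the old value,
     * with its location, into the second bank acting as a FIFO log. */
    void logged_store(uint32_t addr, uint32_t value)
    {
        log_fifo[log_head].addr = addr;
        log_fifo[log_head].old  = bank0[addr];   /* old data is logged          */
        log_head++;                              /* (overflow handling omitted) */
        bank0[addr] = value;                     /* new data is cached          */
    }

    /* Replaying the log in reverse restores the first bank to the state it
     * had before the logged stores, e.g., after a lockstep mismatch. */
    void restore_prior_state(void)
    {
        while (log_head > 0) {
            log_head--;
            bank0[log_fifo[log_head].addr] = log_fifo[log_head].old;
        }
    }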