Abstract:
A processor includes a core with locally-gated circuitry, a decode unit, a local power gate (LPG) coupled to the locally-gated circuitry, and an execution unit. The decode unit includes logic to decode a store broadcast instruction of a specified width. The LPG includes logic to selectively provide power to the locally-gated circuitry, activate power to a first portion of the locally-gated circuitry for execution of full cache-line memory operations, and deactivate power to a second portion of the locally-gated circuitry the locally-gated circuitry. The execution unit includes logic to execute, by the first portion of the locally-gated circuitry for execution of full cache-line memory operations, the store broadcast instruction, the store broadcast instruction to store data of the specified width to storage of the processor.
Abstract:
A processor includes a core with locally-gated circuitry, a decode unit, a local power gate (LPG) coupled to the locally-gated circuitry, and an execution unit. The decode unit includes logic to decode a store broadcast instruction of a specified width. The LPG includes logic to selectively provide power to the locally-gated circuitry, activate power to a first portion of the locally-gated circuitry for execution of full cache-line memory operations, and deactivate power to a second portion of the locally-gated circuitry the locally-gated circuitry. The execution unit includes logic to execute, by the first portion of the locally-gated circuitry for execution of full cache-line memory operations, the store broadcast instruction, the store broadcast instruction to store data of the specified width to storage of the processor.
Abstract:
In one embodiment, the present invention includes an instruction decoder that can receive an incoming instruction and a path select signal and decode the incoming instruction into a first instruction code or a second instruction code responsive to the path select signal. The two different instruction codes, both representing the same incoming instruction may be used by an execution unit to perform an operation optimized for different data lengths. Other embodiments are described and claimed.