Abstract:
A processing device includes a processor core and a number of calculation modules that each is configurable to perform any one of operations for a convolutional neuron network system. A first set of the calculation modules are configured to perform convolution operations, a second set of the calculation modules are reconfigured to perform averaging operations, and a third set of the calculation modules are reconfigured to perform dot product operations.
Abstract:
Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.
Abstract:
A processing device includes a processor core and a number of calculation modules that each is configurable to perform any one of operations for a convolutional neuron network system. A first set of the calculation modules are configured to perform convolution operations, a second set of the calculation modules are reconfigured to perform averaging operations, and a third set of the calculation modules are reconfigured to perform dot product operations.
Abstract:
Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.
Abstract:
Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.
Abstract:
Responsive to execution of a computer instruction in a current translation window, state indicators associated with a cache line accessed for the execution may be modified. The state indicators may include: a first indicator to indicate whether the computer instruction is a load instruction moved from a subsequent translation window into the current translation window, a second indicator to indicate whether the cache line is modified in a cache responsive to the execution of the computer instruction, a third indicator to indicate whether the cache line is speculatively modified in the cache responsive to the execution of the computer instruction, a fourth indicator to indicate whether the cache line is speculatively loaded by the computer instruction, a fifth indicator to indicate whether a core executing the computer instruction exclusively owns the cache line, and a sixth indicator to indicate whether the cache line is invalid.
Abstract:
A processor includes a processor core including an execution unit to execute instructions, and a cache memory. The cache memory includes a controller to update each of a plurality of stale indicators in response to a lazy flush instruction. Each stale indicator is associated with respective data, and each updated stale indicator is to indicate that the respective data is stale. The cache memory also includes a plurality of cache lines. Each cache line is to store corresponding data and a foreground tag that includes a respective virtual address associated with the corresponding data, and that includes the associated stale indicator. Other embodiments are described as claimed.
Abstract:
Systems and methods for performing convolution operations. An example processing system comprises: a processing core; and a convolver unit to apply a convolution filter to a plurality of input data elements represented by a two-dimensional array, the convolver unit comprising a plurality of multipliers coupled to two or more sets of latches, wherein each set of latches is to store a plurality of data elements of a respective one-dimensional section of the two-dimensional array.
Abstract:
A computer-readable storage medium, method and system for optimization-level aware branch prediction is described. A gear level is assigned to a set of application instructions that have been optimized. The gear level is also stored in a register of a branch prediction unit of a processor. Branch prediction is then performed by the processor based upon the gear level.
Abstract:
A processing device includes a processor core and a number of calculation modules that each is configurable to perform any one of operations for a convolutional neuron network system. A first set of the calculation modules are configured to perform convolution operations, a second set of the calculation modules are reconfigured to perform averaging operations, and a third set of the calculation modules are reconfigured to perform dot product operations.