Increased precision neural processing element

    Publication No.: US11604972B2

    Publication Date: 2023-03-14

    Application No.: US16457828

    Filing Date: 2019-06-28

    Abstract: Neural processing elements are configured with a hardware AND gate that performs a logical AND operation between a sign extend signal and the most significant bit (“MSB”) of an operand. The state of the sign extend signal can be based upon the type of the deep neural network (“DNN”) layer that generated the operand. If the sign extend signal is logical FALSE, no sign extension is performed. If the sign extend signal is logical TRUE, a concatenator concatenates the output of the hardware AND gate and the operand, thereby extending the operand from an N-bit unsigned binary value to an N+1-bit signed binary value. The neural processing element can also include another hardware AND gate and another concatenator for processing a second operand in the same way. The outputs of the concatenators for both operands are provided to a hardware binary multiplier.
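
    A minimal Python sketch of the sign-extension path the abstract describes; the operand width N, the function names, and the signed-value helper are illustrative assumptions, not details taken from the patent.

        # Sketch of the sign-extension path described in the abstract.
        # N (the operand width) and all names here are assumptions.

        N = 8  # assumed operand width in bits

        def extend_operand(operand: int, sign_extend: bool) -> int:
            """Extend an N-bit operand to an (N+1)-bit value.

            The hardware AND gate combines the sign-extend signal with
            the operand's most significant bit; the concatenator prepends
            the result, so a TRUE signal reinterprets the operand as a
            signed value while FALSE leaves it unsigned.
            """
            msb = (operand >> (N - 1)) & 1          # MSB of the operand
            extension_bit = msb & int(sign_extend)  # the hardware AND gate
            return (extension_bit << N) | operand   # concatenate to N+1 bits

        def as_signed(value: int, bits: int) -> int:
            """Interpret a `bits`-wide two's-complement value as an int."""
            return value - (1 << bits) if value & (1 << (bits - 1)) else value

        # With sign_extend FALSE, 0b1000_0001 stays 129 (unsigned);
        # with sign_extend TRUE, the same bits become -127 (signed).
        x = 0b1000_0001
        print(as_signed(extend_operand(x, False), N + 1))  # 129
        print(as_signed(extend_operand(x, True), N + 1))   # -127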

    Managing workloads of a deep neural network processor

    Publication No.: US11494237B2

    Publication Date: 2022-11-08

    Application No.: US16454026

    Filing Date: 2019-06-26

    IPC Classification: G06F9/46 G06F9/50 G06N3/04

    Abstract: A computing system includes processor cores for executing applications that utilize functionality provided by a deep neural network (“DNN”) processor. One of the cores operates as a resource and power management (“RPM”) processor core. When the RPM processor receives a request to execute a DNN workload, it divides the workload into workload fragments. The RPM processor then determines whether each workload fragment is to be statically or dynamically allocated to a DNN processor. Once the RPM processor has selected a DNN processor, it enqueues the workload fragment on a queue maintained by the selected DNN processor. The DNN processor dequeues workload fragments from its queue for execution. When execution of a workload fragment completes, the DNN processor generates an interrupt indicating completion. The RPM processor can then notify the processor core that originally requested execution of the workload fragment.
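
    A hedged Python sketch of the enqueue/dequeue flow outlined above; the class shapes, the round-robin policy standing in for dynamic allocation, and the callback standing in for the hardware completion interrupt are all assumptions.

        # Sketch of the RPM-to-DNN-processor hand-off from the abstract.
        # Names, the fragmentation policy, and the completion callback
        # (in place of a hardware interrupt) are illustrative assumptions.
        from queue import Queue

        class DnnProcessor:
            def __init__(self, name: str):
                self.name = name
                self.queue: Queue = Queue()  # per-processor work queue

            def run_next(self, on_complete) -> None:
                """Dequeue one fragment, execute it, then signal
                completion (modeled as a callback, not an interrupt)."""
                fragment = self.queue.get()
                fragment()  # execute the workload fragment
                on_complete(self, fragment)

        class RpmProcessor:
            def __init__(self, processors):
                self.processors = list(processors)
                self._next = 0  # round-robin cursor for dynamic allocation

            def submit(self, workload, static_target=None):
                """Split a workload into fragments and enqueue each on a
                DNN processor, chosen statically if given, else dynamically."""
                for fragment in self.fragment(workload):
                    target = static_target or self.pick_dynamic()
                    target.queue.put(fragment)

            def fragment(self, workload):
                # Assumed policy: the workload arrives as a list of callables.
                return list(workload)

            def pick_dynamic(self):
                target = self.processors[self._next % len(self.processors)]
                self._next += 1
                return target

        rpm = RpmProcessor([DnnProcessor("dnn0"), DnnProcessor("dnn1")])
        rpm.submit([lambda: print("fragment A"), lambda: print("fragment B")])
        for p in rpm.processors:
            p.run_next(lambda proc, frag: print(f"{proc.name}: fragment done"))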

    Placing and solving constraints on a 3D environment

    Publication No.: US10748346B2

    Publication Date: 2020-08-18

    Application No.: US16003985

    Filing Date: 2018-06-08

    Abstract: Systems and methods are disclosed for permitting the use of a natural language expression to specify object (or asset) locations in a virtual three-dimensional (3D) environment. By rapidly identifying and solving constraints for 3D object placement and orientation, consumers of synthetics services may more efficiently generate experiments for use in development of artificial intelligence (AI) algorithms and sensor platforms. Parsing descriptive location specifications, sampling the volumetric space, and solving pose constraints for location and orientation can produce large numbers of designated coordinates for object locations in virtual environments with reduced demands on user involvement. Converting from location designations that are natural to humans, such as “standing on the floor one meter from a wall, facing the center of the room,” to a six-dimensional (6D) pose specification (including 3D location and orientation) can alleviate the need for a manual drag/drop/reorient procedure for placement of objects in a synthetic environment.
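
    A simplified Python sketch of the sample-and-check loop the abstract implies, using its own example constraint (“one meter from a wall, facing the center of the room”). The room dimensions, tolerance, rejection-sampling strategy, and all names are assumptions, not the patented solver.

        # Sketch: sample the volumetric space, keep poses satisfying the
        # distance constraint, then solve the orientation constraint.
        # Room size, tolerance, and function names are assumptions.
        import math
        import random

        ROOM_W, ROOM_D = 6.0, 4.0   # assumed room footprint in meters
        TARGET, TOL = 1.0, 0.05     # one meter from the nearest wall, +/- 5 cm

        def distance_to_nearest_wall(x: float, y: float) -> float:
            return min(x, ROOM_W - x, y, ROOM_D - y)

        def sample_poses(n: int):
            """Rejection-sample (x, y, yaw) poses meeting the constraint.

            A full solver yields a 6D pose; "standing on the floor" fixes
            z, roll, and pitch, leaving x, y, and yaw to solve for.
            """
            poses = []
            while len(poses) < n:
                x = random.uniform(0.0, ROOM_W)
                y = random.uniform(0.0, ROOM_D)
                if abs(distance_to_nearest_wall(x, y) - TARGET) > TOL:
                    continue  # location constraint not met; reject sample
                # Orientation constraint: face the center of the room.
                yaw = math.atan2(ROOM_D / 2 - y, ROOM_W / 2 - x)
                poses.append((x, y, yaw))
            return poses

        for x, y, yaw in sample_poses(3):
            print(f"x={x:.2f} y={y:.2f} yaw={math.degrees(yaw):.1f} deg")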

    Reducing power consumption in a neural network environment using data management

    Publication No.: US10996739B2

    Publication Date: 2021-05-04

    Application No.: US15847785

    Filing Date: 2017-12-19

    Abstract: Techniques for reducing power consumption in a neural network (NN) and/or deep neural network (DNN) environment using data management. Power consumption in the NN/DNN may be reduced by decreasing the number of bit flips needed to process operands associated with one or more storages. The number of bit flips may be reduced by multiplying an operand associated with a first storage with a plurality of individual operands, associated with a second storage, that correspond to a plurality of kernels of the NN/DNN. The operand associated with the first storage may be neuron input data, and the plurality of individual operands associated with the second storage may be weight values for multiplication with the neuron input data. The plurality of kernels may be arranged or sorted, and subsequently processed, in a manner that reduces power consumption in the NN/DNN.
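
    A hedged Python sketch of the sorting idea: order kernel weight values so that successive operands differ in as few bits as possible, approximating the lower switching activity the abstract targets. The greedy nearest-neighbor ordering and 8-bit weights are assumptions, not the patented method.

        # Sketch: order weights to reduce bit flips between successive
        # multiplications. The greedy heuristic here is an assumption.

        def bit_flips(a: int, b: int) -> int:
            """Hamming distance: bits that toggle when a bus goes a -> b."""
            return bin(a ^ b).count("1")

        def total_flips(order) -> int:
            return sum(bit_flips(a, b) for a, b in zip(order, order[1:]))

        def sort_for_low_switching(weights):
            """Greedily chain weights so each step flips few bits."""
            remaining = list(weights)
            order = [remaining.pop(0)]
            while remaining:
                nxt = min(remaining, key=lambda w: bit_flips(order[-1], w))
                remaining.remove(nxt)
                order.append(nxt)
            return order

        weights = [0b1111_0000, 0b0000_1111, 0b1111_0001, 0b0000_1110]
        print("unsorted flips:", total_flips(weights))                      # 23
        print("sorted flips:  ", total_flips(sort_for_low_switching(weights)))  # 9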