RECONFIGURABLE PARALLEL PROCESSOR WITH STACKED COLUMNS FORMING A CIRCULAR DATA PATH

    Publication No.: US20240160602A1

    Publication date: 2024-05-16

    Application No.: US17984351

    Filing date: 2022-11-10

    IPC classification: G06F15/80 G06F9/30

    Abstract: Processors, systems and methods are provided for thread level parallel processing. A processor may include a plurality of columns of vector processing units arranged in a two-dimensional column array with a plurality of column stacks placed side-by-side in a first direction and each column stack having two columns stacked in a second direction and a temporary storage buffer. Each column may include a processing element (PE) that has a vector Arithmetic Logic Unit (ALU) to perform arithmetic operations in parallel threads. At a first end of the column array in the first direction, two columns in the column stack are coupled to the temporary storage buffer for one-way data flow. At a second end of the column array in the first direction, two columns are coupled to each other for one-way data flow. The column array and the temporary storage buffer may form a one-way circular data path.
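
The one-way circular data path described in this abstract can be pictured with a small sketch. It is only an illustration under stated assumptions, not the patented design: the function names `circular_path` and `links`, the node labels, and the direction of travel (out along one row of columns, back along the other) are all hypothetical. The abstract states only that the two columns at the first end couple to the temporary storage buffer and the two columns at the second end couple to each other.

```python
def circular_path(num_stacks):
    """Return node labels in the order data would visit them on one lap.

    Assumed layout (hypothetical): stack 0 sits at the "first end" next to
    the temporary storage buffer, stack num_stacks - 1 sits at the
    "second end" where the two stacked columns are coupled to each other.
    """
    if num_stacks < 1:
        raise ValueError("need at least one column stack")
    # Row 0 of every stack, walking away from the buffer.
    outbound = [f"col[{s}][0]" for s in range(num_stacks)]
    # Row 1 of every stack, walking back toward the buffer.
    inbound = [f"col[{s}][1]" for s in reversed(range(num_stacks))]
    return ["buffer"] + outbound + inbound


def links(path):
    """Return the one-way (source, destination) links, closing the ring."""
    return [(path[i], path[(i + 1) % len(path)]) for i in range(len(path))]


if __name__ == "__main__":
    for src, dst in links(circular_path(3)):
        print(f"{src} -> {dst}")
```

With three column stacks the ring has seven nodes (six columns plus the buffer), and every node has exactly one outgoing link, which is what makes the path one-way and circular.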

    Data-Driven Accelerator For Machine Learning And Raw Data Analysis

    Publication No.: US20170083827A1

    Publication date: 2017-03-23

    Application No.: US14862408

    Filing date: 2015-09-23

    IPC classification: G06N99/00

    CPC classification: G06N20/00 G06F15/8092

    Abstract: Embodiments include computing devices, apparatus, and methods implemented by the apparatus for accelerating machine learning on a computing device. Raw data may be received in the computing device from a raw data source device. The apparatus may identify key features as two dimensional matrices of the raw data such that the key features are mutually exclusive from each other. The key features may be translated into key feature vectors. The computing device may generate a feature vector from at least one of the key feature vectors. The computing device may receive a first partial output resulting from an execution of a basic linear algebra subprogram (BLAS) operation using the feature vector and a weight factor. The first partial output may be combined with a plurality of partial outputs to produce an output matrix. Receiving the raw data on the computing device may include receiving streaming raw data.
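
The flow in this abstract (mutually exclusive 2-D key features, key feature vectors, a BLAS operation against weights, partial outputs combined into an output matrix) can be sketched roughly as follows. Everything here is an assumption made for illustration: the helper names, the use of non-overlapping tiles as the "mutually exclusive" key features, a plain-Python matrix-vector product standing in for the BLAS call, and stacking partial outputs as rows. The abstract does not specify any of these choices.

```python
def split_into_tiles(raw, tile_rows, tile_cols):
    """Cut a 2-D list into non-overlapping tiles (assumed key features)."""
    tiles = []
    for r in range(0, len(raw), tile_rows):
        for c in range(0, len(raw[0]), tile_cols):
            tiles.append([row[c:c + tile_cols] for row in raw[r:r + tile_rows]])
    return tiles


def flatten(tile):
    """Translate a 2-D key feature into a key feature vector (row-major)."""
    return [value for row in tile for value in row]


def gemv(weights, vector):
    """Matrix-vector product; a stand-in for a BLAS level-2 routine."""
    return [sum(w * x for w, x in zip(row, vector)) for row in weights]


def accelerate(raw, weights, tile_rows, tile_cols):
    """Produce one partial output per key feature and stack them as rows."""
    key_vectors = [flatten(t) for t in split_into_tiles(raw, tile_rows, tile_cols)]
    # Assumption: each key feature vector is used directly as the feature vector.
    return [gemv(weights, v) for v in key_vectors]


if __name__ == "__main__":
    raw = [[1, 2, 3, 4],
           [5, 6, 7, 8]]
    weights = [[1, 0, 0, 0],
               [1, 1, 1, 1]]
    print(accelerate(raw, weights, tile_rows=2, tile_cols=2))
```

Because the tiles do not overlap, each partial output depends on a disjoint slice of the raw data, so the partial products could in principle be computed independently as streaming data arrives.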

    APPARATUS, SYSTEMS, AND METHODS FOR LOW POWER COMPUTATIONAL IMAGING
    Document type: Invention application (legal status: in force)

    Publication No.: US20150046675A1

    Publication date: 2015-02-12

    Application No.: US14458052

    Filing date: 2014-08-12

    IPC classification: G06F15/78 G06F9/38 G06F13/28

    Abstract: The present application discloses a computing device that can provide a low-power, highly capable computing platform for computational imaging. The computing device can include one or more processing units, for example one or more vector processors and one or more hardware accelerators, an intelligent memory fabric, a peripheral device, and a power management module. The computing device can communicate with external devices, such as one or more image sensors, an accelerometer, a gyroscope, or any other suitable sensor devices.
