Efficient memory layout for enabling smart data compression in machine learning environments

    Publication No.: US10600147B2

    Publication Date: 2020-03-24

    Application No.: US15682795

    Filing Date: 2017-08-22

    Abstract: A mechanism is described for facilitating efficient memory layout for enabling smart data compression in machine learning environments. A method of embodiments, as described herein, includes facilitating dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by one or more processors of a computing device. The method may further include computing the primary multiple tiles into secondary multiple tiles compatible in size of a local buffer. The method may further include merging the multiple secondary multiple tiles into a final tile representing the image, and compressing the final tile.
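The tile flow in this abstract (divide, subdivide to fit a local buffer, merge, compress) can be illustrated with a minimal Python sketch. It is not the patented implementation. The function names, the tile sizes, the use of `zlib` as the compressor, and the pass-through "computation" on each tile are all assumptions made for illustration only.

```python
import zlib


def split_tiles(image, tile_h, tile_w):
    """Split a 2D list of ints into tiles keyed by their (row, col) origin."""
    tiles = {}
    for r in range(0, len(image), tile_h):
        for c in range(0, len(image[0]), tile_w):
            tiles[(r, c)] = [row[c:c + tile_w] for row in image[r:r + tile_h]]
    return tiles


def merge_tiles(tiles, height, width):
    """Reassemble tiles keyed by origin into a single height x width grid."""
    out = [[0] * width for _ in range(height)]
    for (r, c), tile in tiles.items():
        for i, row in enumerate(tile):
            out[r + i][c:c + len(row)] = row
    return out


def process_and_compress(image, primary=(4, 4), buffer_shape=(2, 2)):
    """Hypothetical pipeline: primary tiles -> buffer-sized tiles -> merge -> compress."""
    height, width = len(image), len(image[0])
    secondary = {}
    for (pr, pc), ptile in split_tiles(image, *primary).items():
        # Each primary tile is handled as if it were an independent image.
        for (sr, sc), stile in split_tiles(ptile, *buffer_shape).items():
            # Assumption: no real computation is applied; the tile is passed through.
            secondary[(pr + sr, pc + sc)] = stile
    final = merge_tiles(secondary, height, width)
    # Assumption: pixel values fit in one byte (0..255); zlib stands in for the compressor.
    payload = bytes(v for row in final for v in row)
    return final, zlib.compress(payload)
```

Keying each tile by its origin in the full image keeps the merge step independent of the order in which tiles are processed, which is one simple way to let tiles be handled separately.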

    Machine learning accelerator architecture

    Publication No.: US10769526B2

    Publication Date: 2020-09-08

    Application No.: US15960851

    Filing Date: 2018-04-24

    Abstract: An apparatus to facilitate acceleration of machine learning operations is disclosed. The apparatus comprises accelerator circuitry including a first set of processing elements to perform first computations including matrix multiplication operations, a second set of processing elements to perform second computations including sum of elements of weights and offset multiply operations and a third set of processing elements to perform third computations including sum of elements of inputs and offset multiply operations, wherein the second and third computations are performed in parallel with the first computations.
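The abstract does not say where the "sum of elements" and "offset multiply" terms come from. One common setting in which the same three terms appear is a matrix product of offset-shifted operands, and the Python sketch below uses that setting as an assumption. The offsets `in_off` and `w_off`, the function names, and the sequential execution are hypothetical. The three terms do not depend on one another, which is what would allow hardware to compute them in parallel.

```python
def matmul(a, b):
    """Plain matrix product of two 2D lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def offset_matmul(inputs, weights, in_off, w_off):
    """Compute (inputs - in_off) @ (weights - w_off) from three independent terms."""
    k = len(weights)  # shared inner dimension

    # First computation: the raw matrix multiplication.
    raw = matmul(inputs, weights)

    # Second computation: per-column sums of the weights, scaled by the input offset.
    weight_term = [in_off * sum(col) for col in zip(*weights)]

    # Third computation: per-row sums of the inputs, scaled by the weight offset.
    input_term = [w_off * sum(row) for row in inputs]

    # Combine: sum_k (a_ik - a0)(w_kj - w0)
    #        = raw_ij - a0 * colsum_j(W) - w0 * rowsum_i(A) + k * a0 * w0
    const = k * in_off * w_off
    return [
        [raw[i][j] - weight_term[j] - input_term[i] + const
         for j in range(len(weight_term))]
        for i in range(len(input_term))
    ]
```

The result can be checked against a direct product of the shifted matrices, for example `matmul([[x - in_off for x in row] for row in inputs], [[w - w_off for w in row] for row in weights])`.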

    Efficient memory layout for enabling smart data compression in machine learning environments

    Publication No.: US20190066257A1

    Publication Date: 2019-02-28

    Application No.: US15682795

    Filing Date: 2017-08-22

    Abstract: A mechanism is described for facilitating efficient memory layout for enabling smart data compression in machine learning environments. A method of embodiments, as described herein, includes facilitating dividing an initial tile representing an image into primary multiple tiles such that each tile of the primary multiple tiles is regarded as an independent image as processed by one or more processors of a computing device. The method may further include computing the primary multiple tiles into secondary multiple tiles compatible in size of a local buffer. The method may further include merging the multiple secondary multiple tiles into a final tile representing the image, and compressing the final tile.
