METHOD AND APPARATUS FOR PREDICTING KERNEL TUNING PARAMETERS

    Publication No.: US20210065051A1

    Publication date: 2021-03-04

    Application No.: US16560954

    Filing date: 2019-09-04

    Abstract: A processing device, which improves processing performance, is provided which comprises memory configured to store data and a processor, in communication with the memory. The processor is configured to receive tuning parameters, each having a numeric value, for executing a portion of a program on an identified hardware device and convert the numeric values of the tuning parameters to words. The processor is also configured to predict, using one or more machine language learning algorithms, which combination of the words to execute the portion of the program on the identified hardware device based on performance efficiency and convert the predicted combination of the words to corresponding numeric values for executing the portion of the program on the identified hardware device.
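
The numeric-to-word conversion described in this abstract can be sketched as a reversible vocabulary lookup. This is a minimal illustration, not the patented implementation: the parameter names (`tile_size`, `unroll`), their candidate values, and the `name_value` word format are all assumptions, and the prediction step (the learning algorithm that chooses a word combination) is deliberately left out.

```python
# Hypothetical tuning parameters; names and candidate values are illustrative only.
PARAM_VALUES = {
    "tile_size": [16, 32, 64],
    "unroll": [1, 2, 4],
}


def build_vocab(param_values):
    """Assign one word (a token string) to each (parameter, value) pair."""
    to_word = {}
    to_value = {}
    for name, values in param_values.items():
        for v in values:
            word = f"{name}_{v}"  # assumed word format, not taken from the patent
            to_word[(name, v)] = word
            to_value[word] = (name, v)
    return to_word, to_value


def encode(config, to_word):
    """Convert numeric parameter values to their words."""
    return [to_word[(name, value)] for name, value in config.items()]


def decode(words, to_value):
    """Convert words (e.g. a predicted combination) back to numeric values."""
    return dict(to_value[w] for w in words)


to_word, to_value = build_vocab(PARAM_VALUES)
config = {"tile_size": 32, "unroll": 4}
words = encode(config, to_word)
restored = decode(words, to_value)
```

Treating each value as a discrete word lets a sequence model rank combinations without assuming that numerically close values perform similarly; `decode` then recovers the numbers needed to launch the kernel.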

    Composable neural network kernels

    Publication No.: US12190225B2

    Publication date: 2025-01-07

    Application No.: US16779557

    Filing date: 2020-01-31

    Abstract: A technique for manipulating a generic tensor is provided. The technique includes receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor, responsive to the first request, performing the first operation on the generic tensor descriptor, receiving a second request to perform a second operation on generic tensor raw data associated with the generic tensor, and responsive to the second request, performing the second operation on the generic tensor raw data.
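
The split between descriptor operations and raw-data operations can be illustrated with a small sketch. It assumes a plain strided layout; the `TensorDescriptor` class and the `transpose` and `load` helpers are hypothetical names chosen for illustration, not the API described in the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensorDescriptor:
    """Layout metadata only: extent and stride of each dimension."""
    lengths: tuple
    strides: tuple

    def offset(self, coord):
        # Map a tensor coordinate to a linear offset into the raw data.
        return sum(c * s for c, s in zip(coord, self.strides))


def transpose(desc, perm):
    """Descriptor operation: reorders dimensions without touching raw data."""
    return TensorDescriptor(
        tuple(desc.lengths[i] for i in perm),
        tuple(desc.strides[i] for i in perm),
    )


def load(raw, desc, coord):
    """Raw-data operation: reads one element through the descriptor."""
    return raw[desc.offset(coord)]


raw = list(range(6))                      # a 2x3 tensor stored row-major
desc = TensorDescriptor((2, 3), (3, 1))
desc_t = transpose(desc, (1, 0))          # now viewed as 3x2, same raw data
value = load(raw, desc_t, (2, 1))         # element [1][2] of the original
```

Because `transpose` only rewrites metadata, it costs the same regardless of tensor size; the raw buffer is touched only when `load` runs.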

    Method and apparatus for predicting kernel tuning parameters

    Publication No.: US12033035B2

    Publication date: 2024-07-09

    Application No.: US16560954

    Filing date: 2019-09-04

    CPC classification: G06N20/00, G06F11/3409, G06N3/02

    Abstract: A processing device, which improves processing performance, is provided which comprises memory configured to store data and a processor, in communication with the memory. The processor is configured to receive tuning parameters, each having a numeric value, for executing a portion of a program on an identified hardware device and convert the numeric values of the tuning parameters to words. The processor is also configured to predict, using one or more machine language learning algorithms, which combination of the words to execute the portion of the program on the identified hardware device based on performance efficiency and convert the predicted combination of the words to corresponding numeric values for executing the portion of the program on the identified hardware device.

    COMPOSABLE NEURAL NETWORK KERNELS

    Publication No.: US20210117806A1

    Publication date: 2021-04-22

    Application No.: US17138709

    Filing date: 2020-12-30

    Abstract: A technique for manipulating a generic tensor is provided. The technique includes receiving a first request to perform a first operation on a generic tensor descriptor associated with the generic tensor, responsive to the first request, performing the first operation on the generic tensor descriptor, receiving a second request to perform a second operation on generic tensor raw data associated with the generic tensor, and responsive to the second request, performing the second operation on the generic tensor raw data, the performing the second operation including mapping a tensor coordinate specified by the second request to a memory address, the mapping including evaluating a delta function to determine an address delta value to add to a previously determined address for a previously processed tensor coordinate.
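
The incremental address mapping mentioned at the end of this abstract can be sketched for the simplest case. The example assumes a plain strided layout, where the delta depends only on the coordinate step; the function name `address_delta` and the traversal are illustrative assumptions, and the patent's delta function may cover more general tensor transformations.

```python
def address_delta(step, strides):
    """Address change caused by moving the coordinate by `step` (strided layout assumed)."""
    return sum(d * s for d, s in zip(step, strides))


strides = (3, 1)      # row-major 2x3 tensor
coord = (0, 0)
addr = 0              # address already computed for the starting coordinate
visited = [addr]

# Walk (0,0) -> (0,1) -> (0,2) -> (1,0), updating the address incrementally
# instead of recomputing the full coordinate-to-address mapping each time.
for step in [(0, 1), (0, 1), (1, -2)]:
    addr += address_delta(step, strides)
    coord = tuple(c + d for c, d in zip(coord, step))
    visited.append(addr)
```

Adding a delta to the previous address avoids re-evaluating the whole mapping for every element, which matters when the mapping is a chain of transformations rather than a single dot product.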
