SYSTEM FOR UNIVERSAL HARDWARE-NEURAL NETWORK ARCHITECTURE SEARCH (CO-DESIGN)

Publication No.: US20220108054A1

Publication Date: 2022-04-07

Application No.: US17552955

Application Date: 2021-12-16

    Abstract: An architecture search system evaluates a search space of neural network and hardware architectures with a plurality of candidate controllers. Each controller attempts to identify an optimized architecture using a different optimization algorithm. To identify a controller for the search space, the architecture search system samples subspaces of the search space having a portion of the neural network search space and a portion of the hardware search space. For each subspace, candidate controllers are scored with respect to the optimized design determined by the respective candidate controllers. Using the scores for the various candidate controllers across the sampled subspaces, a controller is selected to optimize the overall network architecture search space.
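The abstract's controller-selection loop can be sketched as follows. This is an illustrative toy, not the patented method: the objective function, the two example controllers (random search and coordinate-wise greedy), and all names are assumptions standing in for real NAS/hardware optimization algorithms.

```python
import random

random.seed(0)

def evaluate(design):
    """Stand-in objective for a joint NN+HW design; peaks at (6, 4)."""
    nn_depth, hw_units = design
    return -(nn_depth - 6) ** 2 - (hw_units - 4) ** 2

def random_controller(subspace, budget=20):
    """Candidate controller 1: plain random search over the subspace."""
    nn_opts, hw_opts = subspace
    candidates = [(random.choice(nn_opts), random.choice(hw_opts))
                  for _ in range(budget)]
    return max(candidates, key=evaluate)

def greedy_controller(subspace, budget=20):
    """Candidate controller 2: coordinate-wise greedy sweep."""
    nn_opts, hw_opts = subspace
    best_nn = max(nn_opts, key=lambda n: evaluate((n, hw_opts[0])))
    best_hw = max(hw_opts, key=lambda h: evaluate((best_nn, h)))
    return (best_nn, best_hw)

def sample_subspace(nn_space, hw_space, k=4):
    """A subspace holds a portion of the NN space and of the HW space."""
    return (random.sample(nn_space, k), random.sample(hw_space, k))

nn_space = list(range(1, 13))   # e.g. candidate network depths
hw_space = list(range(1, 9))    # e.g. candidate compute-unit counts

# Score every candidate controller by the quality of the design it finds
# on each sampled subspace, then pick the best one for the full space.
controllers = {"random": random_controller, "greedy": greedy_controller}
scores = {name: 0.0 for name in controllers}
for _ in range(10):
    sub = sample_subspace(nn_space, hw_space)
    for name, ctrl in controllers.items():
        scores[name] += evaluate(ctrl(sub))

best = max(scores, key=scores.get)  # controller chosen for the overall search
```

The point of the subspace sampling is that scoring controllers on small slices is cheap, while the winner is then trusted to optimize the much larger joint search space.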

RECONSTRUCTION OF SIGNALS USING A GRAMIAN MATRIX

Publication No.: US20170185900A1

Publication Date: 2017-06-29

Application No.: US14998235

Application Date: 2015-12-26

    CPC classification number: G06N20/00

    Abstract: An apparatus is described herein. The apparatus includes a clustering mechanism that is to partition a dictionary into a plurality of clusters. The apparatus also includes a feature-matching mechanism that is to pre-compute feature matching results for each cluster of the plurality of clusters. Moreover, the apparatus includes a selector that is to locate a best representative feature from the dictionary in response to an input vector.
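The cluster-then-match structure described in the abstract can be sketched as below. This is an assumption-laden illustration: it uses k-means with Euclidean distance and cluster centroids as the pre-computed per-cluster matching results, whereas the patent's matching is built on a Gramian matrix.

```python
import math
import random

random.seed(1)

def dist(a, b):
    # Euclidean distance; the patent's Gramian-based matching is substituted
    # here by a plain nearest-neighbour criterion for illustration.
    return math.dist(a, b)

def kmeans(atoms, k, iters=10):
    """Partition the dictionary into k clusters (toy k-means)."""
    # Deterministic, spread-out seeds to avoid empty clusters in this sketch.
    centroids = [atoms[i * len(atoms) // k] for i in range(k)]
    clusters = []
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for a in atoms:
            clusters[min(range(k), key=lambda i: dist(a, centroids[i]))].append(a)
        centroids = [tuple(sum(x) / len(cl) for x in zip(*cl)) if cl else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids, clusters

def best_representative(x, centroids, clusters):
    # Match the input vector against the pre-computed per-cluster results
    # first, then search only inside the winning cluster.
    i = min(range(len(centroids)), key=lambda j: dist(x, centroids[j]))
    return min(clusters[i], key=lambda a: dist(x, a))

# Toy dictionary: three well-separated blobs of 2-D feature vectors.
dictionary = [(random.gauss(m, 0.3), random.gauss(m, 0.3))
              for m in (0.0, 5.0, 10.0) for _ in range(5)]
centroids, clusters = kmeans(dictionary, k=3)
match = best_representative((4.8, 5.1), centroids, clusters)
```

The pre-computation pays off because each query compares against k centroids plus one cluster's atoms, instead of scanning the entire dictionary.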

    EFFICIENT TOKEN PRUNING IN TRANSFORMER-BASED NEURAL NETWORKS

Publication No.: US20250124105A1

Publication Date: 2025-04-17

Application No.: US19002132

Application Date: 2024-12-26

    Abstract: Key-value (KV) caching accelerates inference in large language models (LLMs) by allowing the attention operation to scale linearly rather than quadratically with the total sequence length. Due to large context lengths in modern LLMs, KV cache size can exceed the model size, which can negatively impact throughput. To address this issue, KVCrush, which stands for KEY-VALUE CACHE SIZE REDUCTION USING SIMILARITY IN HEAD-BEHAVIOR, is implemented. KVCrush involves using binary vectors to represent tokens, where the vector indicates which attention heads attend to the token and which attention heads disregard the token. The binary vectors are used in a hardware-efficient, low-overhead process to produce representatives for unimportant tokens to be pruned, without having to implement k-means clustering techniques.
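The binary-signature grouping described in the abstract can be sketched as below. Everything beyond the abstract's outline is an assumption for illustration: the above-average-attention threshold for building the per-head binary vector, the importance score used to mark tokens as unimportant, and the use of a per-group mean as the representative.

```python
import random

random.seed(0)

NUM_HEADS, NUM_TOKENS, HEAD_DIM = 4, 16, 8

# Toy per-head attention mass over tokens, and toy key vectors (one per token).
attn = [[random.random() for _ in range((NUM_TOKENS))] for _ in range(NUM_HEADS)]
keys = [[random.gauss(0, 1) for _ in range(HEAD_DIM)] for _ in range(NUM_TOKENS)]

def signature(t):
    """Binary vector over heads: True where head h attends to token t
    above that head's average (thresholding rule is an assumption)."""
    return tuple(attn[h][t] > sum(attn[h]) / NUM_TOKENS for h in range(NUM_HEADS))

# Mark the lower half of tokens (by total attention) as unimportant.
importance = [sum(attn[h][t] for h in range(NUM_HEADS)) for t in range(NUM_TOKENS)]
cut = sorted(importance)[NUM_TOKENS // 2]
keep = [imp >= cut for imp in importance]

kept = [keys[t] for t in range(NUM_TOKENS) if keep[t]]

# Group unimportant tokens by shared binary signature and emit one mean
# representative per group -- no k-means clustering involved.
groups = {}
for t in range(NUM_TOKENS):
    if not keep[t]:
        groups.setdefault(signature(t), []).append(keys[t])
reps = [[sum(col) / len(col) for col in zip(*g)] for g in groups.values()]

compressed = kept + reps  # smaller KV cache: kept tokens + representatives
```

Because grouping is a hash on short binary vectors rather than an iterative clustering pass, the reduction step stays low-overhead and hardware-friendly, as the abstract emphasizes.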

    TIME BASED FRAME GENERATION VIA A TEMPORALLY AWARE MACHINE LEARNING MODEL

Publication No.: US20240311950A1

Publication Date: 2024-09-19

Application No.: US18478233

Application Date: 2023-09-29

    CPC classification number: G06T1/20 G06T3/18

Abstract: Described herein is a graphics processor configured to perform time-based frame generation via a temporally aware machine learning model that enables the generation of a frame at a target timestamp relative to the render times of input frames. For example, for an extrapolated frame generated by the temporally aware machine learning model, a low relative timestamp would indicate that the extrapolated frame will appear close in time after the final frame in a sequence of frames and should be relatively close in appearance to the final frame. A higher relative timestamp would indicate that the extrapolated frame should depict a greater degree of evolution based on the optical flow.
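The timestamp-scaled extrapolation idea can be illustrated with a toy 1-D example: pixels of the final frame are advanced along an optical-flow field scaled by the relative timestamp t, so t = 0 reproduces the final frame and larger t extrapolates further. The nearest-neighbour forward splat and the 1-D "frames" are illustrative assumptions; the patent describes a temporally aware machine learning model, not this hand-written warp.

```python
def extrapolate(frame, flow, t):
    """Forward-splat each pixel by round(flow * t) positions.

    t is the relative timestamp of the generated frame; the brightest
    value wins where splats collide (an illustrative tie-break rule).
    """
    out = [0] * len(frame)
    for i, v in enumerate(frame):
        j = i + round(flow[i] * t)
        if 0 <= j < len(frame):
            out[j] = max(out[j], v)
    return out

frame = [0, 0, 9, 0, 0]            # final rendered frame: object at index 2
flow = [0.0, 0.0, 2.0, 0.0, 0.0]   # object moves +2 pixels per frame interval

near = extrapolate(frame, flow, 0.5)  # low timestamp: close to the final frame
far = extrapolate(frame, flow, 1.0)   # higher timestamp: greater evolution
```

With t = 0.5 the object advances one pixel; with t = 1.0 it advances two, matching the abstract's point that the relative timestamp controls how far the generated frame evolves along the optical flow.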
